

APPLICATION  OF  CERTAIN  STATISTICAL  TECHNIQUES 
TO THE ANALYSIS OF CORE SAMPLES


by 

James S. Van Scoyoc

B.S.,  United  States  Naval  Academy,  1953 


Submitted  to  the  Department 
of  Chemical  and  Petroleum 
Engineering  and  the  Faculty 
of  the  Graduate  School  of 
the  University  of  Kansas  in 
Partial  Fulfillment  of  the 
Requirements  for  the  Degree 
of  Master  of  Science. 


Library
U. S. Naval Postgraduate School
Monterey, California

PREFACE 


The  author  has  attempted  to  illustrate  in  a  somewhat 
brief  manner  the  application  of  certain  statistical  techniques 
to  the  analysis  of  core  sampling  data.   The  statistical  areas 
of  frequency  distributions,  analysis  of  variances,  and  to  a 
lesser  degree,  sampling,  provide  the  basis  for  the  study. 

A  convenient  reference  system  is  used  in  the  thesis.   All 
equations  are  numbered  as  (Chapter,  Number).   Thus,  equation 
(2-4)  signifies  the  fourth  equation  of  the  second  chapter. 
The  notes  are  listed  together  before  the  bibliography  and 
are numbered consecutively within each chapter. All mathematical
notations and significant terms used are listed and
defined  in  Appendix  A  for  ready  reference. 

Colt and Mr. Wendell Weatherby of Iola, Kansas; Mr. Schermerhorn
of  Tulsa,  Oklahoma;  Mr.  Ray  Plummer  of  Chanute,  Kansas;  and 
Mr.  Carl  Pate  and  the  Oil  Field  Research  Laboratory  of 
Chanute,  Kansas. 

The  author  wishes  to  express  his  appreciation  to  Dr. 
Charles  F.  Weinaug  for  his  overall  direction  of  the  graduate 
program  which  led  to  the  thesis,  and  to  the  Bureau  of  Supplies 
and  Accounts,  United  States  Navy,  whose  sponsorship  made  the 
thesis  possible. 

The  author  is  most  deeply  indebted  to  Dr.  Floyd  Preston 
whose close personal guidance and wise counsel were instrumental
in bringing the thesis to completion.

CHAPTER I
INTRODUCTION 

Background 

Statistics may be considered in two senses. One conception
of statistics is that of a collection of numerical
or  quantitative  data,  i.e.,  figure  data,  such  as  numerical 
data  on  births  or  unemployment.   Statistics  in  the  second 
sense  is  less  well  known,  and  refers  to  the  techniques  of 
analyzing  data  for  decision-making.   It  may  be  thought  of  as 
the  science  of  decision-making  in  the  face  of  uncertainty. 

The  application  of  statistics  in  this  second  sense  to 
petroleum  engineering  problems  is  the  purpose  of  this  thesis. 
The  petroleum  engineer  by  the  very  nature  of  the  realm  in 
which he must operate is continually faced with making decisions
based upon fragmentary and inconclusive information
concerning  the  sub-surface  of  the  earth.   Because  of  the 
extreme  complexity  of  the  geological  processes  of  erosion, 
material  transport,  deposition  and  burial  of  material,  the 
porous  media  forming  petroleum  reservoirs  are  extremely 
heterogeneous.   Physical  properties  can  vary  extensively  from 
place  to  place  within  individual  reservoirs.   The  extreme 
cost  of  sampling  these  reservoirs  forces  the  engineer  to 
construct  mental  and  mathematical  models  from  fragmentary 
information.   Mathematical  statistics  would  seem  to  provide 
a valuable tool for such model construction. Applying the
systematic analysis techniques of statistics to the data that
are available may provide additional relevant information to
assist the decision-making process.

It  is  the  purpose  of  this  thesis  to  test  the  applicability 
of  certain  statistical  techniques  for  creating  mathematical 
models  of  property  variation  within  petroleum  reservoirs  and 
to  show  how  certain  statistical  techniques  can  be  used  to 
extend  the  interpretation  of  fragmentary  data.   The  study  is 
limited to the analysis of core samples. Statistical techniques
are applied to actual field data obtained from five
fields. Specific results are presented to demonstrate
calculational techniques.

The  first  approach  in  the  study  is  to  examine  the 
probable  frequency  distributions  of  the  various  properties, 
testing  to  see  if  property  distributions  within  individual 
fields  satisfy  the  Gaussian  normal  distribution  and  if  not, 
to  describe  the  distributions  by  a  family  of  generalized 
frequency  curves  known  as  the  Pearson  system  of  frequency 
curves.   An  initial  study  of  the  distributions  is  considered 
essential,  prior  to  the  application  of  the  other  statistical 
techniques such as sampling and analysis of variances, since
these depend upon the nature of the frequency distribution
of  the  data  under  study. 

Origin  of  Statistics 

The  origin  of  the  modern  science  of  statistics  may  be 
traced  to  the  mid-seventeenth  century  when  two  astute  French 
mathematicians, Pascal and Fermat, were presented with a
problem involving a game of chance and the interpretation
of  the  probabilities  associated  with  it.   Their  work  led  to 
solutions,  not  only  of  the  problems  proposed,  but  of  more 
general  ones.   The  methods  employed  by  Pascal  may  be  said 
to  represent  the  beginning  of  the  mathematics  of  probability, 
about  which  modern  statistical  theory  centers  today.   The 
publication by Laplace in 1812 of the epoch-making "Théorie
Analytique des Probabilités" laid a firm foundation for
this  theory. 

Statisticians in general from Pascal onward sought a
method of describing the nature of the distributions of
chance effects. Table I shows the most notable of the
methods developed to portray chance effects from the time
of Pascal until Karl Pearson set forth his system of generalized
frequency curves in 1895.

The  usual  purpose  of  frequency  distributions  is  to 
represent  a  sample  of  actual  data  drawn  from  a  much  larger 
or  even  infinite  population.   Even  though  a  sample  may  be 
composed of a relatively small finite number of observations,
it may be reasonably representative of the larger
universe  from  which  it  was  drawn.   Since  it  is  virtually 
impossible  to  measure  all  the  items  comprising  a  universe, 
it  is  necessary  to  form  a  notion  of  the  larger  group  from 
the  study  of  a  sample.   Thus  by  constructing  a  hypothetical 
infinite population of which the actual data are regarded as
constituting a random sample, an understanding of the law
of distribution of chance effects of the hypothetical
population may be obtained and specified by a few parameters.

For  example,  to  gain  an  understanding  of  the  porosity 
characteristics  of  an  area,  it  may  be  considered  that  the 
earth  contains  an  infinite  number  of  porosity  measurements 
where  it  is  impossible  to  measure  each  and  every  one  of  the 
porosities.   By  fitting  a  curve  to  a  frequency  distribution 
of  the  actual  data  obtained  from  a  core  sample,  it  is 
attempted  to  describe  what  appears  to  be  the  general  form 
of  the  curve  for  the  entire  porosity  population. 

For  statistical  work  the  normal  or  Gaussian  curve  is 
probably  the  best  known  and  most  heavily  relied  upon 
frequency  distribution  of  those  shown  in  Table  I,  particularly 
in the theory of sampling. However, this distribution
function does not apply to skewed frequency distributions,
and numerous sets of data defy the normal curve and
exhibit markedly skewed distributions. The Gaussian school
of  statisticians  regarded  skewness  as  a  by-product  of 
sampling  and  believed  that  skewness  could  be  made  to  disappear 
completely  if  an  infinite  number  of  observations  were  available. 

With  the  recognition  that  the  normal  curve  was  not 
sufficient  to  characterize  all  natural  observations,  it 
became  apparent  that  it  was  necessary  either  to  devise 
methods  of  describing  the  most  conspicuous  departures  from 
the  normal  distribution  or  to  devise  generalized  frequency 
curves  to  describe  distributions  as  they  actually  exist  in 
the  observational  sphere. 

Karl  Pearson  followed  the  latter  course  and  showed  that 
a set of frequency curves could be obtained by assigning
values to the parameters in a certain first order differential
equation which has its basis in the theory of probability.
This approach is covered in Chapter II and the application
of these curves to the actual field data is shown in Chapter III.

Core  Samples 

The  taking  of  core  samples  from  a  reservoir  has  been  an 
accepted practice for the past hundred years. At first, well
samples and cores had but one purpose: to locate oil. Thus
it was necessary to take a sample from every well drilled.
Today, however, with the advent of other techniques such as
electric logging to aid in locating the oil, the primary
reason for taking samples has shifted to providing a source
of information about the reservoir and its contents. At
present,  core  samples  provide  certain  numerical  parameters 
by  which  the  field  may  be  described.   The  most  common  is 
the  arithmetic  or  weighted  average  of  the  various  properties. 
The  range  and  variances  of  the  distribution  of  the  properties 
are  other  easily  obtained  and  useful  parameters  describing 
a  field. 

A  simple  histogram  showing  the  distribution  presents 
visually  the  characteristics  of  a  field.   A  cumulative 
frequency  curve  on  graph  paper  will  present  the  distribution 
and  allow  easy  determination  of  such  parameters  as  the  median 
and possibly the mode. From a cumulative frequency curve an
estimate of the percentage of a distribution which is above
a specified minimum point may be obtained. The range of a
variable within any set quartile is likewise easily determined.
In summary, graphical methods of statistical data
presentation permit certain numerical parameters to be
obtained with relative ease.
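
The quantities read from a cumulative frequency curve can also be computed directly from the measurements. The following is a minimal Python sketch and not part of the thesis procedure; the porosity readings and the 19 percent cutoff are hypothetical values chosen only to illustrate the calculation.

```python
from statistics import median


def cumulative_frequency(values):
    """Pair each sorted value with the fraction of observations <= it."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def fraction_above(values, cutoff):
    """Fraction of observations strictly greater than a specified minimum."""
    return sum(1 for v in values if v > cutoff) / len(values)


# Hypothetical porosity readings in percent (illustrative only, not thesis data).
porosity = [17.2, 18.5, 19.0, 19.4, 20.1, 20.8, 21.3, 22.0]

curve = cumulative_frequency(porosity)      # points of the cumulative curve
mid = median(porosity)                      # median of the sample
share = fraction_above(porosity, 19.0)      # part of the sample above 19 percent
```

Plotting the pairs in `curve` would give the cumulative frequency curve described above; the median and the share above the cutoff are the values one would otherwise read from the graph.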

Since  it  is  no  longer  necessary  to  take  samples  from 
every  well,  it  is  desired  to  determine  the  number  of  wells 
that  should  be  core-sampled  to  provide  information  needed 
with  an  acceptable  probability  of  obtaining  reliable  results. 
At present the number of core samples to be taken is determined
somewhat intuitively with a wide variation of opinion
as  to  what  is  the  necessary  number.   Are  there  statistical 
techniques  available  to  serve  as  a  guideline  in  determining 
how  much  information  is  needed  and  how  such  data  should  be 
interpreted? 

The method of interpretation of core data may be paramount
since raw data by itself, regardless of the amount available,
often supplies much information that is irrelevant and
immaterial.   It  is  the  object  of  statistical  processes 
employed  to  exclude  this  irrelevant  information  and  to 
isolate  the  whole  of  the  relevant  information  contained  in 
the  data. 

Data  and  Techniques 

The science of statistics consists of (1) collecting,
(2)  presenting,  and  (3)  analyzing  quantitative  data.   The 
data  used  for  this  study  consist  of  the  physical  properties 
of oil fields, namely permeability (K), porosity (φ), oil
saturation (S_o), and water saturation (S_w), as obtained from
core samples taken from five different fields. Table II
shows the type and amount of data considered. The five fields
will be referred to as Field 1, Field 2, Field 3, Field 4,
and Field 5. The numbering system has no significance other
than the fields being numbered in sequence as data were
obtained for this study.

Fields  1,  2,  4,  and  5  are  located  in  southeast  Kansas 
while  Field  3  is  in  northeast  Oklahoma.   Field  1  is  in 
County "A" while Fields 2, 4, and 5 are in County "B".

The  core  samples  for  all  five  fields  were  taken  and 
analyzed  by  the  same  laboratory  with  the  same  coring  and 
analyzing  techniques  used  for  all  fields.   Thus  even  though 
there  may  have  been  errors  made  in  arriving  at  the  absolute 
values  of  measurements  of  the  different  properties,  especially 
with  regard  to  the  fluid  saturations,  the  errors  may  be 
considered  consistent,  and  a  relative  comparison  of  the 
data  may  be  made  with  a  certain  degree  of  confidence. 

TABLE II  Summary of Core Sampling Data

Field   No. of   No. of K   No. of φ   No. of S_o   No. of S_w
        Wells    Samples    Samples    Samples      Samples

1       40       1303       794        794          794
2       14       630        345        345          345
3       7        195        132        132          132
4       19       -          316        316          201
5       101      -          1673       1672         1129

Standard coring techniques were employed in taking the
vertical cores of approximately 20 feet in length. Measurements
of each property were made approximately every six
inches  in  the  oil  productive  section  with  approximately 
twenty  samples  being  obtained  for  each  well.   The  depth  to 
the  pay  zone  for  the  different  fields  varied  somewhat,  but 
they  all  could  be  considered  as  shallow  fields  with  pay 
zones  at  depths  between  six  hundred  and  a  thousand  feet. 

The  well  spacing  in  most  instances  was  approximately 
four  hundred  feet.   In  each  field  more  than  50  percent  of 
the  wells  were  cored,  with  coring  data  available  for  all 
properties  with  the  exception  of  permeabilities  for  Fields 
4  and  5. 

A  map  of  each  of  the  first  three  fields  showing  the 
locations  of  the  wells  cored  is  given  in  Appendix  C. 

The  above  mentioned  data  were  used  for  the  following 
statistical  studies  which  constitute  both  the  method  of  and 
justification  for  this  thesis: 

1.   Analytical  Fitting  of  Data  to  the  Pearson  Generalized 
Frequency  Curves.   The  study  includes  the  technique  for 
selecting  the  appropriate  Pearson  type  curve,  for 
fitting  the  data  to  the  selected  curve  and  for  measuring 
the  goodness  of  fit  of  the  data  to  the  curve.   In  addition 
the  data  were  fitted  to  the  normal  or  Gaussian  Curve 
and  its  goodness  of  fit  determined.   Permeability  data 
were converted to logarithms and the resultant distributions
were analyzed by the above methods.

2. An Analysis of Variance Study of Well Property Means
and Variances.

3. Use of Certain Sampling Methods as a Way of Estimating
Certain Population Parameters from Point Estimates.

4. A Brief Discussion of the Application of Additional
Statistical Techniques to Core Analysis.

Statistics,  in  its  many  ramifications,  is  an  exceedingly 
complex subject and much too involved to be thoroughly
covered  in  a  paper  of  this  nature.   The  techniques  presented 
are  by  no  means  the  only  methods  of  analysis  available.   In 
order  to  permit  the  reader  without  a  thorough  background  in 
theoretical  statistics  to  gain  an  understanding  of  the 
material presented, the discussion of the theoretical proofs
and principles involved has been held to a minimum.


CHAPTER  II 
FREQUENCY  CURVES 

Introduction 

One of the most important practical problems in mathematical
statistics is the obtaining of a relatively simple
yet  accurate  representation  of  the  frequency  distribution  of 
any  set  of  data  under  consideration.   Some  knowledge  of  the 
frequency  distribution  of  a  set  of  data  should  be  obtained 
before  a  statistical  analysis  is  attempted. 

There  are  essentially  three  methods  of  describing 
frequency distributions of one variable; namely, the graphical
method, the method of averages and dispersions, and the
method of theoretical frequency functions or curves. These
three methods of describing frequency curves will be briefly
compared as to their relative merits.

The  graphical  method  allows  a  large  amount  of  data  to 
be  condensed  to  an  easily  presentable  form.   An  inherent 
weakness  of  this  method  is  the  inability  to  quantitatively 
compare distributions of two or more sets of data. One may
state  that  two  distributions  are  somewhat  the  same  or  that 
they  are  somewhat  different,  but  the  degree  of  sameness  or 
difference  cannot  be  quantified  by  observation  alone.   This 
lack  of  numerical  description  of  the  distribution  by  the 
graphical  method  precludes  its  use  as  a  comparing  method 
except  in  the  most  elementary  studies. 

Figure  1  shows  two  histograms,  one  representing  the 
frequency distribution of the porosity of Field 1 and the

The second method, involving the use of averages and
dispersion, does give a numerical description of the data
in Figure 1, but it does not give a functional relation
between the values of the variable X and the corresponding
frequencies. The numerical description in terms of X̄ and
σ is:

Field    X̄        σ
1        19.?%    1.46
2        19.1%    1.17

where:

X̄ = Arithmetic mean
σ = Standard deviation

This numerical description would further indicate that the
distributions are nearly the same, which again would be somewhat
misleading.

The third method, an analytical method of describing
frequency distributions, shows that Field 1 is a generalized
frequency distribution of the Pearson Type IB, whereas Field
2 is of the Pearson Type IVB, and that they may be described by the
parameters $\alpha_3^2 = .725$, $\delta = -.294$, and $\alpha_3^2 = 1.86$, $\delta = .141$,
respectively. This method indicates that the distributions
of the two fields are not similar, which is contrary to what
one would assume by use of the first two methods of describing
frequency distributions.

The reader should not be unduly concerned at this point
as to how the values of $\alpha_3^2$ and $\delta$ were determined and what
Type IB and IVB mean, as the following sections will present
a thorough discussion of Pearson's generalized frequency curves.

Pearson's  Generalized  Frequency  Curves 

After  it  was  recognized  that  the  Gaussian  or  normal 
curve failed to describe the distribution of many of the
observed  data,  Pearson  proceeded  to  develop  generalized 
frequency  curves  that  could  characterize  the  various  types 
of  unimodal  frequency  distributions  encountered. 

In  deciding  on  a  system  of  curves  for  describing 
frequency  distributions,  it  was  realized  that: 

1.  Any  expression  used  should  be  a  graduation  formula,  i.e., 
it  must  remove  the  roughness  of  the  data. 

2. An expression should not involve too many high moments
to  calculate  constants,  for  thereby  accuracy  is  reduced. 

3.  There  should  be  a  systematic  method  of  analysis  applicable 
to  all  possible  types  of  frequency  distributions. 

Then, considering the most obvious characteristics of frequency
distributions, it may be considered that they generally start
at zero, rise to a maximum, and then fall at the same or often
at a different rate. At the end of the distribution there is
often high contact. Mathematically, the above implies that
a series of equations Y = f(x) or Y = f(t) must be chosen
so that in each equation of the series dY/dx or dY/dt = 0 for
certain values of x or t, namely at the maximum and at the
end of the curve where there is contact with the axis of x or t.

The  above  suggests  that  the  frequency  function  may  be 
represented  as  a  solution  of  the  differential  equation: 

(2-1)    $\dfrac{dY}{dt} = \dfrac{Y\,(a - t)}{f(t)}$

since: 

a. For a value of t, t = a, dY/dt = 0 and the required
maximum is obtained, and
b. As Y approaches zero, the derivative dY/dt also approaches
zero, thus giving contact at one end of the curve.

Assuming  that  f (t)  may  be  expanded  in  a  converging 
power  series,  equation  (2-1)  may  be  written: 

(2-2)    $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{a - t}{b_0 + b_1 t + b_2 t^2 + \cdots}$


where the mean of the distribution is taken as the origin,
and the abscissae are measured in units of the standard
deviation such that:

(2-3)    $t = \dfrac{\text{distance from origin}}{\sigma}$

The above suggests that significant frequency functions
Y = f(t) may be found among the solutions of (2-2) subject to
the following restrictions, each integral being taken over the
whole range of t:

(2-4)    $\alpha_0 = \int f(t)\,dt = 1$

(2-5)    $\alpha_1 = \int t\,f(t)\,dt = 0$

(2-6)    $\alpha_2 = \int t^2 f(t)\,dt = 1$

where:

(2-7)    $\alpha_n = \dfrac{\mu_n}{\sigma^n} = \int_{-\infty}^{\infty} t^n f(t)\,dt$

and,

(2-7a)   $\alpha_3 = \dfrac{\mu_3}{\sigma^3}, \qquad \alpha_4 = \dfrac{\mu_4}{\sigma^4}$

See Chapter III for definition of the moments $\mu_2, \mu_3, \ldots, \mu_n$.
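
As an illustration of (2-7) and (2-7a), the standardized moments of a sample may be computed as in the following Python sketch. It is not the thesis procedure: the moment definitions belong to Chapter III, so the use of the divisor n with no grouping corrections is an assumption, and the function name and sample values are hypothetical.

```python
def standardized_moments(data):
    """Return (alpha3, alpha4) = (mu3 / sigma**3, mu4 / sigma**4).

    Assumption: central moments are taken with the divisor n and without
    any grouping correction; Chapter III of the thesis may define them
    differently.
    """
    n = len(data)
    mean = sum(data) / n
    mu2 = sum((x - mean) ** 2 for x in data) / n
    mu3 = sum((x - mean) ** 3 for x in data) / n
    mu4 = sum((x - mean) ** 4 for x in data) / n
    sigma = mu2 ** 0.5
    return mu3 / sigma ** 3, mu4 / sigma ** 4


# Hypothetical symmetric sample: alpha3 should vanish.
alpha3, alpha4 = standardized_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this symmetric sample the third standardized moment is zero, as expected for a distribution without skewness.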
Clearing (2-2) of fractions, multiplying through by $t^n$ and
integrating over the range r to s (where r and s are the
extremes of the range of variation for t) with respect to t
gives:

(2-8)    $\left[\, a\int t^n Y\,dt - b_0\int t^n\,dY - b_1\int t^{n+1}\,dY - b_2\int t^{n+2}\,dY - \int t^{n+1} Y\,dt \,\right]_{t=r}^{s} = 0$

But through integration by parts:

(2-8a)   $\left[\int t^n\,dY\right]_{t=r}^{s} = \left[\, t^n Y - n\int t^{n-1} Y\,dt \,\right]_{t=r}^{s}$

and if the frequency function, when multiplied by $t^n$, vanishes
at the limits of the distribution, r and s, we have that:

(2-9)    $\int_{t=r}^{s} t^n\,dY = -n\,N\,\alpha_{n-1},$

which leads to the recursion formula for moments:

(2-9a)   $\alpha_n a + n\,\alpha_{n-1} b_0 + (n+1)\,\alpha_n b_1 + (n+2)\,\alpha_{n+1} b_2 = \alpha_{n+1}$

Giving n successively the values 0, 1, 2, ..., from (2-8) and
(2-9), noting that N, the total frequency, cancels out, and
letting $\alpha_0 = 1$, $\alpha_1 = 0$, $\alpha_2 = 1$, one obtains:
(2-10)   $a + b_1 = 0$

         $b_0 + 3 b_2 = 1$

         $a + 3 b_1 + 4\alpha_3 b_2 = \alpha_3$

         $\alpha_3 a + 3 b_0 + 4\alpha_3 b_1 + 5\alpha_4 b_2 = \alpha_4$

         $\alpha_n a + n\,\alpha_{n-1} b_0 + (n+1)\,\alpha_n b_1 + (n+2)\,\alpha_{n+1} b_2 = \alpha_{n+1}$


Assuming that f(t) converges so rapidly that terms involving
the third and higher powers of t may be neglected, a simultaneous
solution of (2-10) yields:

(2-11)   $a = \dfrac{-\alpha_3}{2(1+2\delta)}, \qquad b_1 = \dfrac{\alpha_3}{2(1+2\delta)}$

         $b_0 = \dfrac{2+\delta}{2(1+2\delta)}, \qquad b_2 = \dfrac{\delta}{2(1+2\delta)}$

where,

(2-12)   $\delta = \dfrac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3}$

The value of a, where:

(2-13)   $a = \dfrac{-\alpha_3}{2(1+2\delta)}$

represents the distance between the mean and the mode, which
is defined by Pearson as the skewness of the distribution.
Thus from the differential equation:

(2-2)    $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{a - t}{f(t)} = \dfrac{a - t}{b_0 + b_1 t + b_2 t^2}$

which has its basis in the theory of probability, the parameters
$a$, $b_0$, $b_1$, and $b_2$ have been determined in terms of
the moments and expressed in terms of $\alpha_3$ and $\delta$. This
transformation is desirable to permit the use of the ($\alpha_3$, $\delta$)
chart for identification of the appropriate type curve that
fits a set of data. With the above parameters defined,
Pearson's family of generalized frequency curves may be
developed.

In the family of curves there are three main types,
nine transitional types which are special conditions of the
three main types, and the normal curve.

Integration  of  the  Differential  Equation 

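For $\delta \neq 0$, $b_2 \neq 0$, the denominator $b_0 + b_1 t + b_2 t^2$ is
a quadratic which can be written in the form $b_2 (t - r_1)(t - r_2)$.

Thus,

(2-14)   $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{a - t}{b_0 + b_1 t + b_2 t^2} = \dfrac{a - t}{b_2 (t - r_1)(t - r_2)}$

where:

(2-15)   $r_1 = \dfrac{-b_1 + \sqrt{b_1^2 - 4 b_0 b_2}}{2 b_2}$   and   $r_2 = \dfrac{-b_1 - \sqrt{b_1^2 - 4 b_0 b_2}}{2 b_2}$

Upon substitution of $\alpha_3$ and $\delta$ in (2-15), for $a$, $b_0$, $b_1$, and $b_2$:

(2-15a)  $r_1 = \dfrac{-\alpha_3 + \sqrt{\alpha_3^2 - 4\delta(\delta+2)}}{2\delta} = \dfrac{-\alpha_3 + \sqrt{D}}{2\delta}$

where:

         $D = \alpha_3^2 - 4\delta(\delta+2)$   and

(2-15b)  $r_2 = \dfrac{-\alpha_3 - \sqrt{\alpha_3^2 - 4\delta(\delta+2)}}{2\delta} = \dfrac{-\alpha_3 - \sqrt{D}}{2\delta}$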
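
The agreement between (2-15) and its reduced forms (2-15a) and (2-15b) may be verified as in the Python sketch below. It uses the parameters quoted earlier in this chapter for Field 1; since only the square of alpha3 is quoted, the positive sign of alpha3 is an assumption, and the function names are hypothetical.

```python
import math


def roots_from_coefficients(b0, b1, b2):
    """r1, r2 of equation (2-15); requires real roots and b2 != 0."""
    root = math.sqrt(b1 ** 2 - 4 * b0 * b2)
    return (-b1 + root) / (2 * b2), (-b1 - root) / (2 * b2)


def roots_from_parameters(alpha3, delta):
    """r1, r2 of equations (2-15a) and (2-15b); requires D >= 0, delta != 0."""
    root_d = math.sqrt(alpha3 ** 2 - 4 * delta * (delta + 2))
    return (-alpha3 + root_d) / (2 * delta), (-alpha3 - root_d) / (2 * delta)


# Parameters quoted in the text for Field 1: alpha3^2 = .725, delta = -.294.
# The positive sign of alpha3 is an assumption.
alpha3, delta = math.sqrt(0.725), -0.294
denom = 2 * (1 + 2 * delta)                      # common denominator of (2-11)
b0, b1, b2 = (2 + delta) / denom, alpha3 / denom, delta / denom

r1, r2 = roots_from_parameters(alpha3, delta)
s1, s2 = roots_from_coefficients(b0, b1, b2)
```

The two routes give the same pair of roots here because 1 + 2δ is positive; for δ below -1/2 the labels r1 and r2 produced by (2-15) would be interchanged, although the pair of roots is unchanged.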
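Also, by the method of partial fractions,

(2-16)   $\dfrac{a - t}{b_2 (t - r_1)(t - r_2)} = \dfrac{1}{b_2}\left[\dfrac{A}{t - r_1} + \dfrac{B}{t - r_2}\right]$   where

(2-17)   $A = \dfrac{a - r_1}{r_1 - r_2}$   and   $B = \dfrac{a - r_2}{r_2 - r_1}$

Hence from (2-14):

(2-18)   $\dfrac{dY}{Y} = \dfrac{1}{b_2}\left(\dfrac{a - r_1}{r_1 - r_2}\right)\dfrac{dt}{t - r_1} + \dfrac{1}{b_2}\left(\dfrac{a - r_2}{r_2 - r_1}\right)\dfrac{dt}{t - r_2}$

Upon integration of (2-18):

(2-19)   $\log Y = \dfrac{1}{b_2}\left(\dfrac{a - r_1}{r_1 - r_2}\right)\log(t - r_1) + \dfrac{1}{b_2}\left(\dfrac{a - r_2}{r_2 - r_1}\right)\log(t - r_2) + \log C$

Hence:

(2-20)   $Y = C\,(t - r_1)^{\frac{1}{b_2}\left(\frac{a - r_1}{r_1 - r_2}\right)}\,(t - r_2)^{\frac{1}{b_2}\left(\frac{a - r_2}{r_2 - r_1}\right)}$

(2-21)   Let $m_1 = \dfrac{1}{b_2}\left(\dfrac{a - r_1}{r_1 - r_2}\right)$ and $m_2 = \dfrac{1}{b_2}\left(\dfrac{a - r_2}{r_2 - r_1}\right)$

Thus (2-20) reduces to:

(2-22)   $Y = C\,(t - r_1)^{m_1}\,(t - r_2)^{m_2}$

Upon substitution of $\alpha_3$ and $\delta$ in (2-21):

(2-23)   $m_1 = \left(\dfrac{1+\delta}{\delta}\right)\left(\dfrac{\alpha_3}{\sqrt{D}}\right) - \left(\dfrac{1+2\delta}{\delta}\right)$   and

         $m_2 = -\left(\dfrac{1+\delta}{\delta}\right)\left(\dfrac{\alpha_3}{\sqrt{D}}\right) - \left(\dfrac{1+2\delta}{\delta}\right)$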
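
The reduction of the exponents of (2-21) to expressions in the two chart parameters can be confirmed numerically. The following Python sketch compares the two forms for one hypothetical pair of parameters with real roots; the function names are not from the thesis.

```python
import math


def exponents_from_definition(a, b2, r1, r2):
    """m1, m2 as defined in equation (2-21)."""
    return (a - r1) / (b2 * (r1 - r2)), (a - r2) / (b2 * (r2 - r1))


def exponents_from_parameters(alpha3, delta):
    """m1, m2 after substituting alpha3 and delta (real roots, D > 0)."""
    root_d = math.sqrt(alpha3 ** 2 - 4 * delta * (delta + 2))
    lead = (1 + delta) / delta
    tail = (1 + 2 * delta) / delta
    return lead * alpha3 / root_d - tail, -lead * alpha3 / root_d - tail


# Hypothetical parameters with delta between -1/2 and 0 (real roots).
alpha3, delta = 0.6, -0.2
root_d = math.sqrt(alpha3 ** 2 - 4 * delta * (delta + 2))
denom = 2 * (1 + 2 * delta)
a, b2 = -alpha3 / denom, delta / denom                    # from (2-11)
r1 = (-alpha3 + root_d) / (2 * delta)                     # from (2-15a)
r2 = (-alpha3 - root_d) / (2 * delta)                     # from (2-15b)

m_def = exponents_from_definition(a, b2, r1, r2)
m_par = exponents_from_parameters(alpha3, delta)
```

Both routes give the same exponents for this case, so either form may be used once the parameters are known.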


For $-1 < \delta < 0$, the r's are real and opposite in sign; for
$\delta > 0$ and $\alpha_3^2 < 4\delta(\delta+2)$, the r's are complex; and for
$\delta > 0$ and $4\delta(\delta+2) < \alpha_3^2$, the r's are real and of the same sign.
These three conditions, with the additional condition that
$\alpha_3 \neq 0$, establish the criteria for the three main types of
Pearson's frequency functions designated Type I, IV, and
VI (note 9). The boundaries of these areas, the curve
$(2+3\delta)\,\alpha_3^2 = 4(1+2\delta)^2 (2+\delta)$ which intersects the Type I and
Type VI areas, and the line $\delta = -1/2$ contain the points which
correspond to the transitional types. The numbering system of
the main types, i.e., I, IV, VI, is that established by Pearson
and is used in this study to provide a standard reference to
other literature on the family of curves.
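
The criteria above lend themselves to a simple selection routine. The Python sketch below classifies a pair of parameters into the main types, together with the Type III and normal conditions given later in this chapter. It is an illustration only: the function name and tolerance are hypothetical, the lower bound on delta for Type I is not checked, and the remaining transitional types are not distinguished.

```python
def main_type(alpha3, delta, tol=1e-12):
    """Classify (alpha3, delta) by the criteria stated in the text.

    Transitional types other than Type III are lumped together, and the
    transitional curve and line crossing the Type I area are not tested.
    """
    if abs(alpha3) <= tol and abs(delta) <= tol:
        return "normal"                      # alpha3 = delta = 0
    if abs(delta) <= tol:
        return "III"                         # alpha3 != 0, delta = 0
    if abs(alpha3) <= tol:
        return "transitional"                # symmetric special cases
    if delta < 0:
        return "I"                           # roots real, opposite in sign
    boundary = 4 * delta * (delta + 2)
    if alpha3 ** 2 < boundary:
        return "IV"                          # roots complex
    if alpha3 ** 2 > boundary:
        return "VI"                          # roots real, same sign
    return "transitional"                    # alpha3^2 = 4*delta*(delta + 2)


# The first pair uses the Field 1 values quoted in the text (alpha3 taken
# as positive, an assumption); the others are hypothetical.
labels = [main_type(0.725 ** 0.5, -0.294), main_type(0.5, 0.2), main_type(1.5, 0.2)]
```

In practice the same decision is read from the chart; the routine merely encodes the inequalities.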

Analysis  of  Data 

For  the  purposes  of  analyzing  the  field  data  under 
consideration,  it  will  be  shown  that  Pearson's  three  main 
types  of  curves:   Type  I,  Type  IV,  Type  VI,  plus  the 
transitional  Type  III  and  the  normal  curve  will  suffice  to 
describe  the  data.   Therefore,  the  development  of  the  other 
transitional  types  will  not  be  described  in  detail.   Only 
the  equations  and  conditions  are  shown  in  Table  III.     On 
the  following  pages  derivations  are  given  for  the  five  types 
of  frequency  distribution  functions  used  in  this  thesis. 

Type I

When the r's in (2-14) are opposite in sign, (2-22) is
written as:

(2-24)   $Y = C\,(t - r_1)^{m_1}\,(r_2 - t)^{m_2}$

Equation (2-24) is called Type I of the Pearson system. The
conditions on $\alpha_3$ and $\delta$ are:

$\alpha_3 \neq 0$, $-1 < \delta < 0$ ($\delta \neq -1/2$), $(2+3\delta)\,\alpha_3^2 \neq 4(1+2\delta)^2 (2+\delta)$

C is determined by setting the area, i.e., the total probability,
equal to unity, or

(2-25)   $C \int_{r_1}^{r_2} (t - r_1)^{m_1} (r_2 - t)^{m_2}\,dt = 1$

Substituting:

(2-26)   $W = \dfrac{t - r_1}{r_2 - r_1}$

(2-27)   $C \int_{0}^{1} W^{m_1} (r_2 - r_1)^{m_1 + m_2 + 1} (1 - W)^{m_2}\,dW = 1$

Hence from the definition of the Beta function:

(2-28)   $C\,(r_2 - r_1)^{m_1 + m_2 + 1}\,B(m_1 + 1,\, m_2 + 1) = 1$
Thus C may be expressed either in terms of the Beta function,
or alternately in terms of the Gamma function, through use
of an identity given by Whittaker and Watson (note 12):

(2-29)   $C = \dfrac{1}{(r_2 - r_1)^{m_1 + m_2 + 1}\,B(m_1 + 1,\, m_2 + 1)} = \dfrac{1}{(r_2 - r_1)^{m_1 + m_2 + 1}} \cdot \dfrac{\Gamma(m_1 + m_2 + 2)}{\Gamma(m_1 + 1)\,\Gamma(m_2 + 1)}$


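The other parameters are:

$r_1 = \dfrac{-\alpha_3 + \sqrt{D}}{2\delta}, \qquad r_2 = \dfrac{-\alpha_3 - \sqrt{D}}{2\delta}$

$m_1 = -\left(\dfrac{1+\delta}{\delta}\right)\left(1 - \dfrac{\alpha_3}{\sqrt{D}}\right) - 1, \qquad m_2 = -\left(\dfrac{1+\delta}{\delta}\right)\left(1 + \dfrac{\alpha_3}{\sqrt{D}}\right) - 1$

for $\alpha_3 > 0$, $r_1 < 0 < r_2$ and $|r_1| < |r_2|$.

The range of the curve is ($r_1$, $r_2$). The curve will be U-shaped
if both m's are < 0, J-shaped if the m's are opposite
in sign, and bell-shaped if both m's are > 0 (note 14).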
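
As a check on (2-24) and (2-29), the Type I curve may be evaluated and its area and mean found by numerical integration. The Python sketch below uses the parameters quoted earlier for Field 1, with alpha3 taken as positive (an assumption, since only its square is quoted). The midpoint rule and the function names are illustrative choices and are not the calculational procedure of the thesis.

```python
import math


def type_one_parameters(alpha3, delta):
    """r1, r2, m1, m2 for a Type I curve (alpha3 > 0, -1/2 < delta < 0 assumed)."""
    root_d = math.sqrt(alpha3 ** 2 - 4 * delta * (delta + 2))
    r1 = (-alpha3 + root_d) / (2 * delta)
    r2 = (-alpha3 - root_d) / (2 * delta)
    lead = (1 + delta) / delta
    m1 = -lead * (1 - alpha3 / root_d) - 1
    m2 = -lead * (1 + alpha3 / root_d) - 1
    return r1, r2, m1, m2


def type_one_density(t, r1, r2, m1, m2):
    """Equation (2-24) with C from the Gamma-function form of (2-29)."""
    c = math.gamma(m1 + m2 + 2) / (
        (r2 - r1) ** (m1 + m2 + 1) * math.gamma(m1 + 1) * math.gamma(m2 + 1)
    )
    return c * (t - r1) ** m1 * (r2 - t) ** m2


def midpoint_integral(func, lo, hi, steps=20000):
    """Plain midpoint rule; adequate here because the integrand is bounded."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Parameters quoted in the text for Field 1 (sign of alpha3 assumed positive).
r1, r2, m1, m2 = type_one_parameters(math.sqrt(0.725), -0.294)
area = midpoint_integral(lambda t: type_one_density(t, r1, r2, m1, m2), r1, r2)
mean = midpoint_integral(lambda t: t * type_one_density(t, r1, r2, m1, m2), r1, r2)
```

For these parameters both exponents are positive, so the curve is bell-shaped; the area comes out as unity and the mean as zero to within the accuracy of the integration, as required for a curve in standard units.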

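Type IV

Conditions: $\alpha_3 \neq 0$, $\delta > 0$, $\alpha_3^2 < 4\delta(\delta+2)$. The conditions
imply that the r's in (2-14) are complex and therefore
the second main type of curve, Type IV, may be determined.
Thus (2-15) can be written:

$r_1 = \dfrac{-\alpha_3 + \sqrt{D}}{2\delta} = -r + iS$   where $r = \dfrac{\alpha_3}{2\delta}$ and $S = \dfrac{\sqrt{-D}}{2\delta}$

$r_2 = \dfrac{-\alpha_3 - \sqrt{D}}{2\delta} = -r - iS$

$m_1 = \dfrac{\nu i}{2} - m, \qquad m_2 = -\dfrac{\nu i}{2} - m$

where:

$\nu = -2\left(\dfrac{1+\delta}{\delta}\right)\dfrac{\alpha_3}{\sqrt{-D}}$   and   $m = \dfrac{1+2\delta}{\delta}$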

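Thus:

(2-22)   $Y = C\,(t - r_1)^{m_1}\,(t - r_2)^{m_2}$   becomes

(2-30)   $Y = C\left[(t + r)^2 + S^2\right]^{-m}\left(\dfrac{t + r - iS}{t + r + iS}\right)^{\frac{\nu i}{2}}$

and since:

$\left(\dfrac{a - bi}{a + bi}\right)^{\frac{ci}{2}} = e^{\,c \tan^{-1}(b/a)} = e^{\,c\left(\frac{\pi}{2} - \tan^{-1}(a/b)\right)}$

the frequency function can be written:

(2-31)   $Y = C\left[(t + r)^2 + S^2\right]^{-m} e^{-\nu \tan^{-1}\left(\frac{t + r}{S}\right)}\, e^{\frac{\nu\pi}{2}}$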


To determine C on setting the area of the curve over the
interval ($-\infty$, $\infty$) equal to unity, Craig shows that:

(2-32)   $C = \dfrac{S^{2m-1}}{G(2m-2,\, \nu)}$

where:

(2-33)   $G(2m-2,\, \nu) = \int_{0}^{\pi} \sin^{2m-2}\theta\; e^{\nu\theta}\, d\theta$

The term $\theta$ is defined as:

(2-33a)  $\theta = \dfrac{\pi}{2} - \tan^{-1}\left(\dfrac{t + r}{S}\right)$

The function $G(2m-2,\, \nu)$ is obtainable in tabular form from
Pearson.

For this thesis, the function $G(2m-2,\, \nu)$ was generated
by a special numerical integration procedure involving
Gaussian coefficients.
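
The evaluation of the G function and of the constant C in (2-32) may be sketched in Python as follows. The thesis used a numerical procedure with Gaussian coefficients that is not reproduced here; a plain midpoint rule is substituted as an assumption, and the parameter values and function names are hypothetical.

```python
import math


def type_four_parameters(alpha3, delta):
    """r, S, m, nu for Type IV (delta > 0, alpha3^2 < 4*delta*(delta + 2))."""
    neg_d = 4 * delta * (delta + 2) - alpha3 ** 2        # -D, positive here
    r = alpha3 / (2 * delta)
    s = math.sqrt(neg_d) / (2 * delta)
    m = (1 + 2 * delta) / delta
    nu = -2 * ((1 + delta) / delta) * alpha3 / math.sqrt(neg_d)
    return r, s, m, nu


def midpoint_integral(func, lo, hi, steps=20000):
    """Plain midpoint rule (a stand-in for the Gaussian procedure of the text)."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


def g_function(power, nu):
    """G(power, nu) of equation (2-33), integrated from 0 to pi."""
    return midpoint_integral(
        lambda th: math.sin(th) ** power * math.exp(nu * th), 0.0, math.pi
    )


def type_four_density(t, r, s, m, nu, c):
    """Equation (2-31)."""
    angle = math.atan((t + r) / s)
    return c * ((t + r) ** 2 + s ** 2) ** (-m) * math.exp(nu * (math.pi / 2 - angle))


# Hypothetical parameters inside the Type IV region.
r, s, m, nu = type_four_parameters(alpha3=0.5, delta=0.2)
c = s ** (2 * m - 1) / g_function(2 * m - 2, nu)         # equation (2-32)
area = midpoint_integral(lambda t: type_four_density(t, r, s, m, nu, c), -60.0, 60.0)
```

The integration limits of plus and minus 60 standard units stand in for the infinite range; the tails beyond them are negligible for these parameters, and the computed area is unity to within the accuracy of the integration.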

Type VI

Conditions: $\alpha_3 \neq 0$, $\delta > 0$, $\alpha_3^2 > 4\delta(\delta+2)$. The
conditions imply that the r's in (2-14) are real and of the
same sign, thus the third main type of curve, Type VI, is
obtained.

(2-22)   $Y = C\,(t - r_1)^{m_1}\,(t - r_2)^{m_2}$

An alternative simplified form is:

(2-34)   $Y = C\,Z^{m_2}\,(Z - a)^{m_1}$

Craig (note 17) shows that:

(2-35)   $C = \dfrac{\Gamma(-m_2)}{\Gamma(m_1 + 1)\;\Gamma(-m_2 - m_1 - 1)\;a^{\,m_1 + m_2 + 1}}$

where:

(2-36)   $Z = t - r_2$

(2-37)   $a = r_1 - r_2$

Range of curve is ($r_1$, $\infty$).


Normal  Type  Curve 

Conditions: $\alpha_3 = \delta = 0$. The original differential
equation:

(2-2)  $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{a - t}{b_0 + b_1 t + b_2 t^2}$

reduces to:

(2-38)  $\dfrac{1}{Y}\dfrac{dY}{dt} = -t$

when $\alpha_3 = \delta = 0$, since from equation (2-11)
$a,\ b_1,\ b_2 = 0$ and $b_0 = 1$.

Hence:

(2-38)  $\dfrac{dY}{Y} = -t\,dt$

Upon integration:

(2-39)  $\log Y = -\dfrac{t^2}{2} + \log C$

(2-40)  $Y = C\,e^{-t^2/2}$

where: [20]

$$C = \frac{N}{\sqrt{2\pi}\,\sigma}$$

where N = total number of observations.

Type  III 

Conditions: $\alpha_3 \neq 0$, $\delta = 0$. From equation (2-11) for
$\delta = 0$:

$$a = -\frac{\alpha_3}{2}, \qquad b_1 = \frac{\alpha_3}{2}, \qquad b_0 = 1, \qquad b_2 = 0$$

Therefore the differential equation:

(2-2)  $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{a - t}{b_0 + b_1 t + b_2 t^2}$

becomes:

(2-41)  $\dfrac{1}{Y}\dfrac{dY}{dt} = \dfrac{-\frac{\alpha_3}{2} - t}{1 + \frac{\alpha_3}{2}\,t}$

which yields after integration: [21]

(2-42)  $Y = C\left(1 + \dfrac{\alpha_3}{2}\,t\right)^{\frac{4}{\alpha_3^2} - 1} e^{-\frac{2}{\alpha_3}\,t}$

Let $A = \dfrac{2}{\alpha_3}$

and:

(2-43)  $Y = C_1\,(A + t)^{A^2 - 1}\; e^{-At}$

where:

$$C_1 = \frac{C}{A^{\,A^2 - 1}}$$

and Craig shows that:

(2-44)  $C_1 = \dfrac{A^{A^2}}{e^{A^2}\;\Gamma(A^2)}$
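As an illustration of (2-43) and (2-44), the sketch below evaluates the Type III constant and ordinate for a curve of unit area. The function names are hypothetical, and the use of `math.lgamma` (to avoid overflow for large $A^2$) is a convenience not taken from the thesis.

```python
import math


def type_iii_constant(a):
    """C1 = A**(A*A) / (exp(A*A) * Gamma(A*A)), evaluated through logarithms."""
    a2 = a * a
    return math.exp(a2 * math.log(a) - a2 - math.lgamma(a2))


def type_iii_ordinate(t, a):
    """Unit-area ordinate of equation (2-43); zero outside the range t > -A."""
    if t <= -a:
        return 0.0
    return type_iii_constant(a) * (a + t) ** (a * a - 1.0) * math.exp(-a * t)


if __name__ == "__main__":
    # A = 2 / alpha_3; alpha_3 = 2 gives A = 1 and the curve exp(-t - 1).
    print(type_iii_constant(1.0), type_iii_ordinate(0.0, 1.0))
```

With A = 1 the ordinate reduces to $e^{-t-1}$ on the range (-1, $\infty$), which gives a quick check on the constant.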

Since it is the purpose of this thesis to utilize Pearson's
frequency curves in fitting the field data under consideration,
rather than to develop the mathematical theory upon which the
curves are based (which is a complete thesis in itself), only
limited discussion of the development of the curves has been
presented. The foregoing discussion and derivations represent a
synopsis of information given in the following references, which
may be consulted for a more complete coverage of the development
of these curves:

a.  Annals of Mathematical Statistics, Volume VII, 1936, "A
New Exposition and Chart for the Pearson System of
Frequency Curves" by Cecil C. Craig.

b.  Frequency Curves and Correlation, by W. P. Elderton,
Cambridge, 1938.

c.  "Karl Pearson's System of Generalized Frequency Curves",
by Arnold M. Wedel, Thesis, University of Kansas, 1948.

d.  Handbook of Mathematical Statistics, "Frequency Curves",
by H. C. Carver, pp. 92-119, 1924.

The  entire  family  of  curves  is  shown  in  Table  III.   The 
main  types  are  shown  followed  in  order  by  the  transitional 
types  associated  with  a  particular  main  type.   This  study 
has  been  limited  to  the  use  of  the  three  main  types,  the 
transitional  Type  III,  and  the  Normal  Curve.   These  five 
curves  adequately  describe  the  field  data  used  for  this  study. 

($\alpha_3^2$, $\delta$) Chart

In the course of the preceding discussion a set of conditions
for the various types of functions has been established in terms
of $\alpha_3^2$ and $\delta$, parameters which may be readily calculated. The
numerical values of these two parameters determine the Pearson
curve appropriate to a particular distribution. The conditions
for each type of curve are summarized in Table III.

An ($\alpha_3^2$, $\delta$) chart which gives visual presentation of these
conditions and an automatic means for type identification is
relatively easy to construct.

In addition to the lines $\delta = -1$, $\delta = -1/2$, $\delta = 0$, $\delta = 2/5$,
and $\alpha_3 = 0$, the chart contains only the curve:

(2-45)  $\alpha_3^2 = 4\delta(\delta+2)$

on which the points corresponding to the Type V function lie,
and the curve:

(2-46)  $(2 + 3\delta)\,\alpha_3^2 = 4(1 + 2\delta)^2\,(2 + \delta)$

on which the points corresponding to the distribution functions
of Type VII, IX, X, and XI are found.

TABLE III  Summary of Pearson System of Frequency Curves

(Columns: TYPE, EQUATION, CONDITIONS, REMARKS. The REMARKS
column reads, in the order tabulated:)

 1.  Limited range, skew, usually bell-shaped but may be U-shaped
     or J-shaped. Range is ($r_1$, $r_2$).
 2.  Limited range, symmetrical, usually bell-shaped but U-shaped
     when $\alpha_4 < 1.8$. Range is ($-s$, $s$).
 3.  Unlimited range in one direction, skew; bell-shaped, but may
     be J-shaped. Range is ($-A$, $\infty$).
 4.  J-shaped. Range from infinite ordinate at $r_1$ to finite
     ordinate at $t = r_2$.
 5.  Range from $t = r_1$ to infinite ordinate at $r_2$. J-shaped.
 6.  J-shaped with range ($-1$, $\infty$).
 7.  J-shaped with range (…).
 8.  Unlimited range, skew, bell-shaped.
 9.  Unlimited range in one direction, bell-shaped. Range is
     ($-r$, $\infty$).
10.  Unlimited range, symmetrical, bell-shaped.
11.  Unlimited range, symmetrical, bell-shaped.
12.  Unlimited range in one direction, skew, bell-shaped but may
     be J-shaped.
13.  J-shaped with finite ordinate at $t = r_1$. Range is
     ($r_1$, $\infty$).

Construction of ($\alpha_3^2$, $\delta$) Chart

The point $\alpha_3^2 = 0$, $\delta = 0$ satisfies the conditions for the
normal curve and is the starting point for constructing the
graph. The lines $\delta = -1$, $\delta = -1/2$, $\delta = 0$, $\delta = 2/5$, and
$\alpha_3 = 0$ are easily constructed.

For the equation:

(2-45)  $\alpha_3^2 = 4\delta(\delta+2)$

      $\delta$        $\alpha_3^2$
      0.       0.
      .1       .84
      .2       1.76
      .3       2.76
      .4       3.84

For the equation:

(2-46)  $(2+3\delta)\,\alpha_3^2 = 4(1+2\delta)^2\,(2+\delta)$

      $\delta$        $\alpha_3^2$
      0.       4.
      .2       6.634
      .4       9.72
      -.2      1.85
      -.4      .32
      -.5      0.
      -.6      1.12
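The two tabulations above follow directly from equations (2-45) and (2-46) and can be regenerated with a few lines. The function names below are illustrative, not taken from the thesis.

```python
def alpha3_sq_type_v(delta):
    """alpha_3^2 on the Type V curve, equation (2-45)."""
    return 4.0 * delta * (delta + 2.0)


def alpha3_sq_curve_2_46(delta):
    """alpha_3^2 on the curve of equation (2-46); undefined where 2 + 3*delta = 0."""
    return 4.0 * (1.0 + 2.0 * delta) ** 2 * (2.0 + delta) / (2.0 + 3.0 * delta)


if __name__ == "__main__":
    for d in (0.0, 0.1, 0.2, 0.3, 0.4):
        print(f"(2-45)  delta = {d:5.2f}   alpha3^2 = {alpha3_sq_type_v(d):.3f}")
    for d in (0.0, 0.2, 0.4, -0.2, -0.4, -0.5, -0.6):
        print(f"(2-46)  delta = {d:5.2f}   alpha3^2 = {alpha3_sq_curve_2_46(d):.3f}")
```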

Note that when $\delta = -2/3$ the expression $(2+3\delta) = 0$; therefore
the curve for this equation approaches the line $\delta = -2/3$
asymptotically.

Figure 2 presents visually the family of Pearson's curves in
terms of $\alpha_3^2$ and $\delta$. It is a simple matter to determine the
type of curve that fits any set of data merely by determining
the $\alpha_3^2$ and $\delta$ values for the data and then entering the chart
with these values.

The subscript B on the chart refers to bell-shaped curves, the
subscript J refers to J-shaped curves, and the subscript U to
U-shaped curves.
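Entering the chart amounts to testing the conditions stated for each curve. The sketch below covers only the five curves used in this study. It assumes that Type I corresponds to $-1 < \delta < 0$ (consistent with the worked example of Chapter III, but the Type I conditions are stated before this passage), it does not distinguish the transitional types other than III and V, and the tolerance `tol` for treating small values as zero is a hypothetical parameter.

```python
def pearson_main_type(alpha3_sq, delta, tol=0.0):
    """Return the curve type implied by (alpha_3^2, delta); a sketch only."""
    if delta < -1.0:
        return "impossible"          # the "Impossible Area" of the chart
    if abs(delta) <= tol:
        return "Normal" if alpha3_sq <= tol else "III"
    if delta < 0.0:
        return "I"                   # assumed condition for Type I
    d = alpha3_sq - 4.0 * delta * (delta + 2.0)
    if abs(d) <= tol:
        return "V"                   # on the curve of equation (2-45)
    return "IV" if d < 0.0 else "VI"


if __name__ == "__main__":
    print(pearson_main_type(0.79781261, -0.10950116))  # Field 3 permeability
    print(pearson_main_type(1.3635387, 0.14055615))    # Field 2 porosity
```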

The points for $\delta < -1$ correspond to no frequency functions;
they fall in the "Impossible Area."

Pearson designated as heterotypic those members of his system
for which the eighth moment failed to exist. (In such a case the
standard deviation of the fourth moment in samples would be
infinite.) This area is established by $\delta \geq .4$.

CHAPTER III
APPLICATION OF PEARSON CURVES


Introduction 


The  brief  theoretical  discussion  of  the  Pearson 
frequency  curves  having  been  presented  it  is  now  useful  to 
examine  how  these  curves  may  be  applied  to  the  field  data 
under  consideration. 

From  Table  I  there  are  eighteen  sets  of  data  available 
for  the  five  fields.   The  Pearson  system  of  frequency 
curves  will  be  applied  to  find  a  good  theoretical  fit  for 
each  given  observed  distribution. 

What is the value of frequency curves? A normal curve is
fitted to a given set of data to determine whether or not the
data are normally distributed. If the distribution is normal,
then use may be made of the extensive body of sampling theory
applicable to normal populations. Comparatively little is known
concerning sampling from non-normally distributed populations.

When data are distinguished as non-normal, there may be
further advantages in fitting a non-normal curve to the data.
Such a curve may serve to smooth the histogram and may thus
permit a more accurate determination of the relative frequencies
of the population from which the sample was taken. The
identification of the distribution of the given data with a
particular frequency curve may also serve to distinguish them
from other data, the distribution of which is identified with a
different frequency curve. These are the two principal reasons
for fitting non-normal frequency curves. Also, if a particular
type of non-normal distribution occurs with sufficient
frequency, this fact will serve as an incentive for the creation
of the appropriate sampling theory.

Fitting  the  Data 

The  arithmetical  labor  involved  in  fitting  a  set  of 
observed  data  to  a  frequency  curve  is  lengthy  and  tedious. 
This  may  be  a  prime  reason  why  there  has  not  been  greater 
utilization  of  theoretical  curves  in  practice  to  obtain  a 
description  of  the  distribution  of  a  given  set  of  data. 
With  the  advent  of  high  speed  digital  computers  this 
objection  to  the  heavy  arithmetical  work  involved  is  lessened. 

To illustrate the procedure involved in fitting a set of data
to a Pearson frequency curve, the step by step calculations for
one set of data, that of permeability for Field 3, are shown.
The calculations for fitting the remaining sets of data were
performed on a digital computer. The computer programs, written
in Fortran language, with accompanying flow charts are shown in
Appendix B. For each data set, the histogram and plotted curve
follow the numerical calculations for that set. The normal curve
has also been fitted to each set of data to illustrate the
comparison with the non-normal curve that describes the data.
In many cases the goodness of fit test for the normal curve
indicates that the normal curve gives a poor fit for the data,
which is as one would expect for some of the more markedly
skewed distributions. In a few cases, though, the normal curve
does give a satisfactory fit to the data, even though the normal
fit is not as good as that of the frequency curve describing
the data.

Procedure 

The  procedure  used  in  fitting  the  various  types  of 
curves  is: 

1.  Arrange the data in an array. Tabulate the data using
convenient class intervals.

2.  Calculate the first four moments about a convenient
vertical.

3.  Transfer the moments to the centroid vertical or vertical
through the mean.

4.  Apply Sheppard's corrections to the moments if there is
high contact at both ends of the curve.

5.  Calculate $\alpha_4$, $\alpha_3^2$, and $\delta$.

6.  Locate the mean $\bar{X}$ and the mode.

7.  Determine by use of the ($\alpha_3^2$, $\delta$) chart what type of
curve should be used.

8.  Calculate the constants for the equation of the curve.

9.  Calculate the theoretical frequencies at the mid-point
of each class interval.

10.  Plot the histogram.

11.  Plot the theoretical curve constructing the mid-ordinates
at the middle of each class interval.

12.  Calculate the area graduation for each interval. Test
for goodness of fit of theory to observation.

13.  Fit the normal curve to the data for comparison and
test for goodness of fit of the normal curve.

Sample  Calculations 

The  permeability  data  for  Field  3  consists  of  195 
observations  obtained  from  seven  core  analyses  from  seven 
wells.   These  data  were  divided  into  14  class  intervals  of 
10  millidarcies  each.   Table  IV  shows  a  convenient  method 
of  calculating  the  first  four  moments  for  the  core  data. 
Following Table IV are the calculations needed to fit the
frequency curve to the data.

The fitting of a set of data to a Pearson curve is based on
the method of moments, where the $r^{th}$ moment around an arbitrary
origin is:

(3-1)  $\mu_r' = \displaystyle\int_{-\infty}^{\infty} x^r \, dF(x)$

The zero-th moment about the origin, $\mu_0'$, always exists and
is equal to one.

The $r^{th}$ moment around the mean is:

(3-2)  $\mu_r = \displaystyle\int_{-\infty}^{\infty} (X - \mu_1')^r \, dF(X)$

If the moments around an arbitrary origin are known, the
following formula is used to find the moments about the mean:

(3-3)  $\mu_r = \mu_r' - \binom{r}{1}\mu_{r-1}'\,\mu_1' + \binom{r}{2}\mu_{r-2}'\,\mu_1'^{\,2} - \cdots + (-1)^r \mu_1'^{\,r}$

Thus:

$$\mu_1 = 0$$

$$\mu_2 = \mu_2' - \mu_1'^{\,2}$$

$$\mu_3 = \mu_3' - 3\mu_1'\,\mu_2' + 2\mu_1'^{\,3}$$

For ease of hand calculation and simplicity of notation,
the scale for the independent variable of the frequency
distribution (i.e., porosity, permeability, or saturation) was
transformed to a notation wherein the interval containing the
arbitrarily chosen mid-point is numbered zero and intervals on
either side are numbered serially. The negative values were
assigned to that side of the distribution which contained the
mode. This convention is in keeping with that adopted for the
sign of $\alpha_3$ in the derivation of the Pearson system of frequency
curves. An additional transformation of the independent
variable is made to correspond to standard statistical notation.
This transformation is the introduction of the standard unit t.
Thus when the histogram intervals are numbered serially,

(3-4)  $t = \dfrac{x - \mu_1'}{\sigma}$

where x takes on the values 1, 2, 3, 4, etc. at interval
mid-points.

The equation relating t and the original physical property,
q, is:

(3-5)  $q = t\,\sigma\,\Delta q + \bar{q}$

where $\bar{q}$ is the mean value of q and $\Delta q$ is the interval of q
used to construct the histogram on the q scale.

The  following  calculations  starting  with  Table  IV  are 
for  the  permeability  observations  for  Field  3.   For  each 
of  the  other  data  sets  only  the  resulting  constants  and 
equations  are  shown  followed  by  the  graph  of  the  frequency 
curve  and  normal  curve. 

Taking moments about the arbitrary mid-ordinate 45.0 in terms
of the transformed variable x with its corresponding mid-point,
x = 0, the following moments are calculated:

(3-6)  $\mu_1' = \dfrac{\Sigma f x}{N} = \dfrac{-90}{195} = -.46153846$

(3-7)  $\mu_2' = \dfrac{\Sigma f x^2}{N} = \dfrac{1260}{195} = 6.4615385$

       $\mu_3' = \dfrac{\Sigma f x^3}{N} = \dfrac{960}{195} = 4.9230769$

(3-8)  $\mu_4' = \dfrac{\Sigma f x^4}{N} = \dfrac{25584}{195} = 131.2000$

And, for moments about the mean,

$$\mu_1 = 0$$

$$\mu_2 = \mu_2' - \mu_1'^{\,2} = 6.4615385 - (-.46153846)^2 = 6.2485208$$

$$\mu_3 = 4.9230769 - 3(-.46153846)(6.4615385) + 2(-.46153846)^3 = 13.673191$$

$$\mu_4 = \mu_4' - 4\mu_1'\,\mu_3' + 6\mu_1'^{\,2}\,\mu_2' - 3\mu_1'^{\,4} = 148.41116$$

And the standard deviation for grouped data,

(3-9)  $\sigma = \sqrt{\mu_2} = \sqrt{6.2485208} = 2.4997035$

Hence Sheppard's corrections are:

(3-10)  $\lambda_2 = \mu_2 - \dfrac{1}{12} = 6.2485208 - .083333 = 6.1651875$

(3-11)  $\lambda_3 = \mu_3 = 13.673191$

(3-12)  $\lambda_4 = \mu_4 - \dfrac{1}{2}\mu_2 + \dfrac{7}{240} = 148.41116 - \dfrac{1}{2}(6.2485208) + .029167 = 145.31607$

Thus,

$$\alpha_3^2 = \frac{\lambda_3^2}{\lambda_2^3} = \frac{(13.673191)^2}{(6.1651875)^3} = .79781261$$

$$\alpha_4 = \frac{\lambda_4}{\lambda_2^2} = \frac{145.31607}{(6.1651875)^2} = \frac{145.31607}{38.0095364} = 3.8231476$$

$$\delta = \frac{2\alpha_4 - 3\alpha_3^2 - 6}{\alpha_4 + 3} = \frac{2(3.8231476) - 3(.79781261) - 6}{(3.8231476) + 3} = -.10950116$$

$$D = \alpha_3^2 - 4\delta(\delta+2) = .79781261 - 4(-.10950116)(1.8904988) = 1.6258600$$
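The moment calculations above can be retraced with a short sketch. The class frequencies are those listed for Field 3 permeability (with 8 observations in the 85 md class, as in the goodness of fit tabulation), and x runs from -4 to 9 with x = 0 at the 45 md mid-point. The function names are illustrative, and the sketch is not the Fortran program of Appendix B.

```python
def grouped_moments(freqs, xs):
    """Return N, the first raw moment, and the 2nd-4th moments about the mean."""
    n = sum(freqs)
    v1, v2, v3, v4 = (
        sum(f * x ** k for f, x in zip(freqs, xs)) / n for k in range(1, 5)
    )
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    mu4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    return n, v1, mu2, mu3, mu4


def pearson_criteria(mu2, mu3, mu4):
    """Apply Sheppard's corrections (unit class width), then form the criteria."""
    l2 = mu2 - 1.0 / 12.0
    l3 = mu3
    l4 = mu4 - 0.5 * mu2 + 7.0 / 240.0
    alpha3_sq = l3 ** 2 / l2 ** 3
    alpha4 = l4 / l2 ** 2
    delta = (2 * alpha4 - 3 * alpha3_sq - 6) / (alpha4 + 3)
    d = alpha3_sq - 4 * delta * (delta + 2)
    return alpha3_sq, alpha4, delta, d


FREQS = [15, 29, 34, 32, 24, 20, 16, 11, 8, 2, 1, 1, 1, 1]
XS = list(range(-4, 10))  # x = 0 at the 45 md mid-point

if __name__ == "__main__":
    n, v1, mu2, mu3, mu4 = grouped_moments(FREQS, XS)
    print(n, v1, mu2, mu3, mu4)
    print(pearson_criteria(mu2, mu3, mu4))
```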

Then entering the ($\alpha_3^2$, $\delta$) chart with:

$\alpha_3^2 = .79781261$ and $\delta = -.10950116$

it is observed that the frequency distribution of the
permeability for Field 3 may be fitted by a Pearson Type $I_B$
curve.

The equation of the curve is:

$$Y = C\,(t - r_1)^{m_1}\,(r_2 - t)^{m_2}$$

where:

$$C = \frac{N\;\Gamma(m_1 + m_2 + 2)}{\sigma\,(r_2 - r_1)^{m_1+m_2+1}\;\Gamma(m_1+1)\;\Gamma(m_2+1)}$$

where N is the sum of the frequency for all class intervals
and:

$$r_1 = \frac{-\alpha_3 + \sqrt{D}}{2\delta} = \frac{-.89320367 + \sqrt{1.6258600}}{2(-.10950116)} = -1.7437633$$

$$r_2 = \frac{-\alpha_3 - \sqrt{D}}{2\delta} = 9.9007883$$

$$m_1 = -\left(\frac{1 - .10950116}{-.10950116}\right)\left(1 - \frac{.89320367}{1.2750919}\right) - 1 = 1.4356194$$

$$m_2 = -\left(\frac{1+\delta}{\delta}\right)\left(1 + \frac{\alpha_3}{\sqrt{D}}\right) - 1 = 12.829027$$

$$C = \frac{195\;\Gamma(1.4356194 + 12.829027 + 2)}{(2.4997035)\,(9.9007883 + 1.7437633)^{15.264646}\;\Gamma(2.4356194)\;\Gamma(13.829027)}$$

$$C = \frac{195\;\Gamma(16.264646)}{(2.4997035)\,(11.6445516)^{15.264646}\;\Gamma(2.4356194)\;\Gamma(13.829027)}$$

$$C = 2.211844 \times 10^{-12}$$
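The constants of the Type I curve can be checked with the following sketch. One point is an inference rather than a statement of the text: the value C = 2.211844 x 10^-12 is reproduced when the standard deviation $\sigma$ (in class-interval units) appears in the denominator of C together with $\Gamma(m_1+1)$ and $\Gamma(m_2+1)$, and the code is written on that assumption. The function name is illustrative and `math.lgamma` replaces the table look-up used in the thesis.

```python
import math


def type_i_constants(alpha3_sq, delta, n, sigma):
    """Roots, exponents and constant of a Type I curve (sketch, see assumptions)."""
    alpha3 = math.sqrt(alpha3_sq)
    root_d = math.sqrt(alpha3_sq - 4.0 * delta * (delta + 2.0))
    r1 = (-alpha3 + root_d) / (2.0 * delta)
    r2 = (-alpha3 - root_d) / (2.0 * delta)
    k = (1.0 + delta) / delta
    m1 = -k * (1.0 - alpha3 / root_d) - 1.0
    m2 = -k * (1.0 + alpha3 / root_d) - 1.0
    # Work in logarithms because the individual factors are very large.
    log_c = (
        math.log(n)
        + math.lgamma(m1 + m2 + 2.0)
        - math.log(sigma)               # assumed factor, inferred from the arithmetic
        - (m1 + m2 + 1.0) * math.log(r2 - r1)
        - math.lgamma(m1 + 1.0)
        - math.lgamma(m2 + 1.0)
    )
    return r1, r2, m1, m2, math.exp(log_c)


if __name__ == "__main__":
    print(type_i_constants(0.79781261, -0.10950116, 195, 2.4997035))
```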
The Gamma function, defined as:

(3-12)  $\Gamma(x) = \displaystyle\int_0^{\infty} e^{-t}\, t^{(x-1)}\, dt, \quad x > 0$

was calculated in two ways, depending upon the value of x.
For values of x less than 9.0, the Gamma function was obtained
from the recursion formula:

(3-13)  $\Gamma(x+1) = x\,\Gamma(x)$

which is useful for expressing $\Gamma(x)$ as a function of some
value of $\Gamma(a)$ where a is a value between 1 and 2. The value
of $\Gamma(a)$ was determined by a table look-up and interpolation
using tabulated Gamma functions.

For values of x above 9.0, $\Gamma(x)$ was computed from the
Stirling formula for log $\Gamma(x)$:

(3-14)  $\log_{10} \Gamma(x) = -.43429448\,x + (x - .5)\log_{10} x + \log_{10}\left(1 + \dfrac{1}{12x}\right) + .39908993$
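The two-branch evaluation of the Gamma function can be sketched as follows. The table look-up for $\Gamma(a)$, $1 \le a < 2$, is replaced here by the standard library function `math.gamma`, which is an assumption made for brevity; the function names are illustrative.

```python
import math


def log10_gamma_stirling(x):
    """Equation (3-14): Stirling approximation to log10 Gamma(x)."""
    return (
        -0.43429448 * x
        + (x - 0.5) * math.log10(x)
        + math.log10(1.0 + 1.0 / (12.0 * x))
        + 0.39908993
    )


def gamma_by_recursion(x):
    """Reduce x into [1, 2) with Gamma(x+1) = x Gamma(x), then look up Gamma(a)."""
    factor = 1.0
    while x >= 2.0:
        x -= 1.0
        factor *= x
    while x < 1.0:
        factor /= x
        x += 1.0
    return factor * math.gamma(x)  # stands in for the tabulated value


def thesis_gamma(x):
    """Recursion below 9.0, Stirling formula otherwise."""
    if x < 9.0:
        return gamma_by_recursion(x)
    return 10.0 ** log10_gamma_stirling(x)


if __name__ == "__main__":
    for x in (2.4356194, 13.829027, 16.264646):
        print(x, thesis_gamma(x), math.gamma(x))
```

The truncated Stirling series leaves a relative error of roughly $1/(288x^2)$, which is below one part in 20,000 for x above 9.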

The equation of the curve is:

$$Y = (2.211844 \times 10^{-12})\,(t + 1.7437633)^{1.4356194}\,(9.90079 - t)^{12.829}$$

The range of the curve can be computed from the above
equation.

The range of the curve is defined as the upper and lower
values of the independent variable between which all positive
frequencies exist. These theoretical limits are determined by
seeking the values of t for which Y = 0. By inspection of the
above equation, Y is zero when t is -1.7437633 and +9.90079.
These values of t correspond to permeabilities of -2.65 md and
288.35 md respectively, as determined from the equation:

$$q = \bar{q} + t\,\sigma\,\Delta q$$

where t is one of the limiting values given above, $\sigma$ the
standard deviation in units of x, $\Delta q$ is the class interval in
units of q, and $\bar{q}$ is the mean value of q. Because negative
values of permeability are not physically meaningful, the limits
can be looked upon as zero and 288.35 millidarcies. The
existence of the negative lower limit should present no more
interpretational difficulties than those presented by the known
limits of any normal frequency curve, which are by definition
$-\infty$ and $+\infty$.

For calculation of graduation (mid-ordinates), the following
arrangement is convenient:

   Column                                   Value
   (1)   Mid-point                          65
   (2)   Distance from origin               2.46153846
   (3)   t = (2)/$\sigma$                          .98475219
   (4)   (3) - $r_1$                           2.7284955
   (5)   log (4)                            .43592
   (6)   $r_2$ - (3)                           8.9160361
   (7)   log (6)                            .95017
   (8)   $m_1$ x (5)                           .624807
   (9)   $m_2$ x (7)                           12.18975
   (10)  (8) + (9) + log C = log Y          1.16038
   (11)  Y                                  14.467
   (12)  Area of interval                   14.512

The area of the interval is found by means of Simpson's
Rule: [8]

(3-15)  $\displaystyle\int_0^1 f(x)\,dx = \tfrac{1}{6}\,(Y_0 + 4Y_{1/2} + Y_1)$
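Equation (3-15) is the single-panel form of Simpson's Rule over one class interval. A minimal sketch, written for an arbitrary interval (a, b) so that it reduces to (3-15) when the interval has unit width, is given below; the function name is illustrative.

```python
def simpson_interval(f, a, b):
    """Single-panel Simpson's Rule: (b - a)/6 * (f(a) + 4 f(midpoint) + f(b))."""
    mid = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(mid) + f(b))


if __name__ == "__main__":
    # The rule is exact for polynomials up to the third degree.
    print(simpson_interval(lambda x: x ** 3, 0.0, 1.0))
```

In the graduation by areas, f would be the fitted frequency curve and (a, b) the boundaries of a class interval expressed in the same units as the curve's argument.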


By use of the above calculating techniques, the values of Y
(Graduation, mid-ordinates) and Graduation (Areas) were found
for each interval. Additionally the values of $Y_{normal}$
(Graduation, mid-ordinates) were found for each mid-point by
substituting the appropriate value of t for each mid-point into
the equation:

$$Y = C_N\, e^{-t^2/2}$$

where:

$$C_N = \frac{N}{\sqrt{2\pi}\,\sigma}$$

Thus:

$$Y_{65} = \frac{N}{\sqrt{2\pi}\,\sigma}\; e^{-(.9847)^2/2} = 19.164$$
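The normal mid-ordinates can be regenerated from N, the mean, and the standard deviation in class-interval units. The sketch below uses the Field 3 permeability values derived earlier (N = 195, first moment -.46153846, $\sigma$ = 2.4997035); the function and variable names are illustrative.

```python
import math


def normal_mid_ordinates(n, mean_x, sigma, xs):
    """Y = C_N exp(-t^2/2) with C_N = N / (sqrt(2 pi) sigma), t = (x - mean)/sigma."""
    c_n = n / (math.sqrt(2.0 * math.pi) * sigma)
    return [c_n * math.exp(-0.5 * ((x - mean_x) / sigma) ** 2) for x in xs]


N, MEAN_X, SIGMA = 195, -0.46153846, 2.4997035
XS = list(range(-4, 10))  # x = 0 at the 45 md mid-point, x = 2 at 65 md
ORDINATES = normal_mid_ordinates(N, MEAN_X, SIGMA, XS)

if __name__ == "__main__":
    for x, y in zip(XS, ORDINATES):
        print(f"mid-point {45 + 10 * x:4d} md   Y = {y:6.2f}")
```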


The following summary for Field 3 permeability data is
presented:   Type $I_B$ Curve

$$Y = (2.21184 \times 10^{-12})\,(t + 1.7437633)^{1.4356194}\,(9.90079 - t)^{12.829}$$

Permeability               Graduation        Graduation   Normal Curve
Midpoint (md)  Frequency   (mid-ordinates)   (Areas)      (mid-ordinates)

      5           15           14.73           14.48          11.43
     15           29           29.14           28.72          18.53
     25           34           33.85           33.57          25.75
     35           32           31.81           31.68          30.41
     45           24           26.47           26.44          30.60
     55           20           20.23           20.25          26.23
     65           16           14.47           14.51          19.16
     75           11            9.77            9.82          11.93
     85            8            6.26            6.31           6.33
     95            2            3.82            3.86           2.86
    105            1            2.22            2.24           1.10
    115            1            1.22            1.24            .36
    125            1             .64             .65            .10
    135            1             .32             .32            .02

                 195          194.95          194.09         184.36

Goodness of Fit

The Chi Square ($\chi^2$) goodness of fit test for the above
curve may be calculated by the following method:

Chi-Square "Goodness of Fit"

              Graduated
     f            F          f - F      (f - F)^2    (f - F)^2 / F

    15          14.5           .5          .25           .013
    29          28.7           .3          .09           .003
    34          33.6           .4          .16           .006
    32          31.7           .3          .09           .003
    24          26.4         -2.4         5.76           .225
    20          20.3          -.3          .09           .003
    16          14.5          1.5         2.25           .152
    11           9.8          1.2         1.44           .142
     8           6.3          1.7         2.89           .444
     2           3.8         -1.9         3.61           .893
     1           2.2         -1.2         1.44           .689
     1           1.2          -.2          .04           .046
     1            .7           .3          .09           .183
     1            .3           .7          .21           .700

               194.1                                     3.522
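The statistic tabulated above is the sum of $(f - F)^2/F$ over the class intervals. The sketch below forms it from the observed frequencies and the graduated areas of the Field 3 permeability summary. Because it works from the rounded areas as printed, its total is not expected to agree exactly with the 3.522 of the tabulation; the function name is illustrative.

```python
def chi_square(observed, graduated):
    """Sum of (f - F)^2 / F over all class intervals."""
    return sum((f - g) ** 2 / g for f, g in zip(observed, graduated))


OBSERVED = [15, 29, 34, 32, 24, 20, 16, 11, 8, 2, 1, 1, 1, 1]
AREAS = [14.48, 28.72, 33.57, 31.68, 26.44, 20.25, 14.51,
         9.82, 6.31, 3.86, 2.24, 1.24, 0.65, 0.32]

if __name__ == "__main__":
    value = chi_square(OBSERVED, AREAS)
    dof = len(OBSERVED) - 1  # n - 1, as used in the text
    print(f"chi-square = {value:.3f} with {dof} degrees of freedom")
```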

The number of degrees of freedom for this curve is n - 1 = 13,
where n is the number of class intervals. Entering a $\chi^2$
table [9] with 13 degrees of freedom and $\chi^2 = 3.522$, an
approximate value of p = .99 is found. This indicates that there
are 99 chances out of 100 that differences as large as those
found could have arisen due to chance or sampling variation.
Therefore, we can conclude that the Type $I_B$ curve gives an
exceptionally good fit to the permeability data. For testing the
goodness of fit the significance level of p is usually taken as
either .05 or .01. [10]

Numerical Results

Permeability (K) - Field 1

Data Range (.2, 439 md.)        Mean = 63.26 md.
Class interval = 40 md.         $\sigma$ = 60.8 md.
$\alpha_3^2$ = 3.0327404    $\delta$ = -.2034506    Type $I_J$ curve
$r_1$ = -.93047735               $r_2$ = 9.4901735
$m_1$ = -.3008110                $m_2$ = 6.1312036
C = 2.8551007 x 10^-4

$$Y = (2.8551007 \times 10^{-4})\,(t + .93047735)^{-.3008110}\,(9.4901735 - t)^{6.1312036}$$

Curve Range (6.7, 490 md.)

$I_J$ Curve                        Normal Curve
$\chi^2$ = 17.48                      $\chi^2$ = 968.2
Fit is good at .05 level        Fit is not good

Permeability               Graduation       Graduation   Normal Curve
Midpoint (md.) Frequency   (mid-ordinate)   (Areas)      (mid-ordinate)

     20          610           786.1          600.8         253.3
     60          284           320.9          329.7         338.7
    100          189           174.1          176.9         294.6
    140          101            97.7           99.1         166.7
    180           71            54.4           55.1          61.3
    220           25            29.4           29.9          14.7
    260           13            15.3           15.5          22.8
    300            5             7.5            7.6            .2
    340            1             3.4            3.5            .02
    380            3             1.4            1.4            .0007
    420            1              .5             .5            .00002

                1303
Porosity ($\phi$) - Field 1

Data Range (5.4, 25.5%)         Mean = 19.43%
Class interval = 2.5%           $\sigma$ = 3.65
$\alpha_3^2$ = .72488482    $\delta$ = -.29374565    Type $I_B$ curve
$r_1$ = -1.3630479               $r_2$ = 4.2614796
$m_1$ = .16531900                $m_2$ = 2.6432928
C = 3.7594063

$$Y = (3.7594063)\,(t + 1.3630479)^{.16531900}\,(4.2614796 - t)^{2.6432928}$$

Curve Range (3.83, 24.0%)

$I_B$ Curve                        Normal Curve
$\chi^2$ = 19.841246                  $\chi^2$ = 160.8
Fit is good at .02 level        Fit is not good

Porosity                   Graduation       Graduation   Normal Curve
Midpoint (%)   Frequency   (mid-ordinate)   (Areas)      (mid-ordinate)

    6.25           4             1.8            2.1            .4
    8.75           3            10.9           11.4           3.3
   11.25          32            31.0           31.5          18.8
   13.75          84            64.0           64.5          67.4
   16.25          90           110.3          110.8         151.6
   18.75         158           168.0          168.2         213.6
   21.25         278           228.8          227.7         188.8
   23.75         140           246.9          206.6         104.6
   26.25           5             0.             0.           36.4

                 794
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Oil  Saturation  (S^)—  Field  1 


o 


Data Range (7.0, 73.0%)         Mean = 38.89%

Class interval = 5%             $\sigma$ = 11.05

cu2  =  .020572560       6  =  .027215^31       Normal  Type  curve 

C = 142.75

$$Y = 142.75\; e^{-t^2/2}$$

Curve Range ($-\infty$, $\infty$)

Normal Curve
$\chi^2$ = 16.568630
Fit is good at .25 level

$S_o$                          Normal
Midpoint (%)   Frequency   Mid-ordinates

     8.5           6            3.7
    13.5          20           11.3
    18.5          29           28.3
    23.5          43           57.7
    28.5          84           95.3
    33.5         116          129.7
    38.5         162          143.2
    43.5         141          128.8
    48.5          93           94.5
    53.5          53           56.5
    58.5          20           27.6
    63.5          12           11.0
    68.5           4            3.5
    73.5           1             .9
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Water  Saturation  (S  ) — Field  1  " 

Data Range (15, 89%)            Mean = 40.96%
Class interval = 5%             $\sigma$ = 12.15
$\alpha_3^2$ = .72375019    $\delta$ = -.10356269    Type $I_B$ curve
$r_1$ = -1.8241159               $r_2$ = 10.038806
$m_1$ = 1.6619956                $m_2$ = 13.649978
C = 9.5915501 x 10^-13

$$Y = (9.5915501 \times 10^{-13})\,(t + 1.8241159)^{1.6619956}\,(10.038806 - t)^{13.649978}$$

Curve Range (18.9, 171)

$I_B$ Curve                        Normal Curve
$\chi^2$ = 10.875972                  $\chi^2$ = 95.56815
Fit is good at .80 level        Fit is not good

. 

w       r,         Graduation   Graduation  Normal  Curve 
Midpoint  {%)   *reQLuency  Mid-ordinates   (Areas)   Mid-ordinates 


16.5 

8 

0. 

.6 

18.6 

21.5 

27 

35.^- 

36.3 

33.6 

26.5 

89 

103.2 

101.6 

67.3 

31.5 

130 

136.5 

135.1 

99.2 

36.5 

160 

136.8 

136.0 

123.4 

41.5 

136 

117. 8 

117.5 

129.8 

46.5 

76 

91.6 

91.7 

115.2 

51.5 

51 

65.9 

66.1 

86.3 

56.5 

49 

44.3 

44.6 

54.6 

61.5 

32 

28.1 

28.3 

29.2 

66.5 

15 

16.9 

17.0 

13.2 

71.5 

10 

9.6 

9.7 

5.0 

76.5 

6 

5-1 

.      5-2 

1.6 

31. 5 

3 

2.6 

2.7 

.4 

86.5 

2 

1.2 

1.3 

.1 
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Permeability  (K) — Field  2 


Data  Range   (.16,  381.O  md. )  Mean  =  49.8  md. 
Class  interval  =  30  md.       a   =  53.4  md. 
a   2   =  5.437653^-       5  =  -.  2255^678       Type  Ij  curve 
^  =  -.71192903  r2  =  H.O50699 

m1  =  -.58435670  m2  =  5.^516965 

C  =  1.8509951  x  10" ! 

Y  =  (1.8509951x10"^)  (t+.71192903)"*33^5670  (ll.050699-t)5*ii'510965 

Curve  Range  (11. 9*  636.8  md. ) 
Ij  Curve  Normal  Curve 

X2  =  24.96777  X2  =  486.497 

Fit  is  good  at  .02  level  Fit  is  not  good 


Permeability   „  Graduation   Graduation  Normal  Curve 

Mid-point  (md.  )    qu  n  y  (mi^_orc^ina-tes)  .(Areas)    (mid-ordinates) 


15 

295 

1030.9 

.     757.2 

111.6 

^5 

164 

131.0 

139.1 

140.1 

75 

71 

66.8 

68.1 

128.4 

105 

32 

39.^ 

39.8 

85.7 

135 

28 

24.4 

24.6 

41.8 

165 

12 

15.3 

15.5 

14.8 

195 

17 

9.7 

9.8 

3.8 

225 

5 

6.1 

6.1 

.7 

255 

2 

3.8 

3.8 

.1 

285 
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2.2 
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.01 
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.0007 
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57 
Porosity  (<£>) — Field  2 

Data  Range   (8.1,  24.3)       Mean  =  19.72$ 

Class  interval  =  2>i  a   =  2.34^ 

o^.2  =  1.3635387       6  =  .14055615       Type  VIB  curve 

m1  =  4.5483532  m2  =  -22.773043 

r1  =  -1. 969330O  r2  =  -7.7311368 

c  =  20.968745 

Y  =  (20.968745) (t+7. 7311363) -22.773043    (t+i.9698380)4-5488552 

Curve  Range    (-co,    24.33) 
VXq  Curve  Normal   Curve 

X2  =  15.436512  X2  =  71.647549 

Fit  is  good  at  .05  level  Fit  is  not  good 


Porosity   „     .     Graduation    Graduation  Normal  Curve 
Midpoint  ($)  recluen  y  (mid-ordinates)   Areas    (mid-ordinates) 
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2 

1.9 
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.1 

13 

6 
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1.9 

15 

9 
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Oil  Saturation  (S)— Field.  2 

Data  Ranee   (9,  79f°)  Mean  =  49.38^ 

Class  interval  .-  7fo  a   =  14.  7 

a_.2  =  .6911815       5  -  -.305O0OI6       Type  Xg  curve 

r1  =  -1 00OOI93  r2  =  4.0852955 

m1  =  .13792540  m2  =  2.4181582 

C  =  1.7498843 

Y=  (1. 7498843) (t+1.3600193)1*379234  (4.0852955-t)2* ^l8l532 

Curve  Range  (-10.7,  69.4) 
Xo  Curve  Normal  Curve 

X2  =  21.737241  X2  =  90.32 

Fit  is  good  at  .01  level  Fit  is  not  good 


3 

0      /  "^reouencv  Graduation  Graduation     Normal 
Midpoint  ($)  x       °    Mid-ordinate   Areas    (mid-ordinates) 
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Water  Saturation  (S  ) --Field  2 
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Data  Range  (11,  85^) 
Class  interval  =  6% 
a   2   =   1.0579891 
r1   =   -I.I787589 
m1  =  -.12466560 


Mean  =  38.65^ 
a  =   14.82  . 
6  =  -.32958992       Type  Ij  curve 
r2  =  4.2995634 
m2  =  2.1928124 


C  =  1.8860123 

Y  =  (1.8860123)  (t+1. 1787589)"*  122|6656°  (4.2995634-t)2-1928124 

Curve  Range  (21.2,  102.3) 


lj   Curve 
X2  =  7.3586264 
Fit  is  good  at  .75  level 


Normal  Curve 


X  =  98.996253 


Fit  is  not  good 


5 
w         -n,   .       Graduation  Graduation    Normal 
Mid-point  {%)   ±ireclaency  Mid-ordinates   Areas    (mid-ordinates) 
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Porosity  (0) --Field  3 

Data  Range  (10.2,  23.1$)      Mean  =  18.06^ 
Class  interval  =  1.5$        a  =2.46 

a  2  =  .58195^15       <5  =  -.25091835       Type  IB  curve 
r»1  =  -1.526424  r2  =  4.5666935 

n^  =  .4957617  m2  =  3.4749591 

C  =  .11555541 

Y  =  (.11555541)  (t+1.526424)4'957617  (4.5666935-t)3-i+749591 

Curve  Range  (6.7,  21.8) 
1-c  Curve  Normal  Curve 

X2  =  17.752871  X2  =  36.031458 

Pit  is  good  at  .05  level  Pit  is  not  good 


Porosity   „  Graduation  Graduation    Normal 

Midpoint  ($)  *recluency  Mid-ordinates   Areas    (mid-ordinates) 
Oil Saturation (So) -- Field 3

Data Range (15.1, 48.7%)      Mean = 33.26%
Class interval = 3%           σ = 5.7

α₃² = .79434377      δ = .072484741      Type VIB curve
r1 = -4.2828677      r2 = -6.6759047
m1 = 51.961440       m2 = -83.553454

C = 37.514620

$Y = (37.514620)\,(t + 6.6759047)^{-83.553454}\,(t + 4.2828677)^{51.961440}$

Curve Range (-∞, 57.7)

VIB Curve                     Normal Curve
χ² = 10.864505                χ² = 24.761836
Fit is good at .50 level      Fit is good at .01 level
Water Saturation (Sw) -- Field 3

Data Range (30.4, 76.1)       Mean = 45.35%
Class interval = 5%           σ = 7.85

α₃² = 1.4631973      δ = -.03992642      Type IIIB curve

A = 1.6534022        C = 13.616174

$Y = C\,(A + t)^{A^2 - 1}\,e^{-At}$

$Y = (13.616174)\,(1.6534022 + t)^{A^2 - 1}\,e^{-1.6534022\,t}$

Curve Range (32.4, ∞)

IIIB Curve                    Normal Curve
χ² = 4.7735?15                χ² = 16.839006
Fit is good at .85 level      Fit is good at .05 level

Sw Midpoint (%)   Frequency   Graduation Mid-ordinates   Graduation Areas   Normal Mid-ordinates
Porosity (φ) -- Field 4

Data Range (9.3, 22.4%)       Mean = 17.8%
Class interval = 1.5%         σ = 2.0

α₃² = 1.1315972      δ = .0047189      Type IIIB curve

A = 1.8801133        C = 18.684424

$Y = C\,(A + t)^{A^2 - 1}\,e^{-At}$

$Y = (18.684424)\,(1.8801133 + t)^{2.5348}\,e^{-1.8801133\,t}$

Curve Range (-∞, 21.57)

IIIB Curve                    Normal Curve
χ² = 6.3034068                χ² = 19.69985

Fit  is  good. 

at 
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t  is  good  at  .02 
Oil Saturation (So) -- Field 4

Data Range (13.3, 68.2%)      Mean = 39.36%
Class interval = 3.5%         σ = 10.1

α₃² = .01836461      δ = -.1317?302      Type IB curve
r1 = -3.2859450      r2 = 4.3142724
m1 = 4.6968277       m2 = 6.4796347

C = 1.1959325 × 10⁻⁵

$Y = (1.1959325 \times 10^{-5})\,(t + 3.2859450)^{4.6968277}\,(4.3142724 - t)^{6.4796347}$

Curve Range (6.9, 83.3)

IB Curve                      Normal Curve
χ² = 13.992946                χ² = 15.734882
Fit is good at .60 level      Fit is good at .40 level

So Midpoint (%)   Frequency   Graduation Mid-ordinates   Graduation Areas   Normal (Mid-ordinates)
Water Saturation (Sw) -- Field 4

Data Range (16.6, 76.8%)      Mean = 39.81%
Class interval = 6%           σ = 13.5

α₃² = .44508784      δ = -.41374506      Type IJ curve
r1 = -1.3112949      r2 = 2.9237671
m1 = -.122539        m2 = .9564568

C = 10.194623

$Y = (10.194623)\,(t + 1.3112949)^{-.122539}\,(2.9237671 - t)^{.9564568}$

Curve Range (22.1, 79.2)

IJ Curve                      Normal Curve
χ² = 15.406190                χ² = 28.03342
Fit is good at .05 level      Fit is not good

Sw Midpoint (%)   Frequency   Graduation Mid-ordinates   Graduation Areas   Normal Mid-ordinates
Data Range (3.4, 34.3%)       Mean = 20.66%
Class interval = 3%           σ = 4.32

α₃² = .33849171      δ = .25960502      Type IV curve
m = 5.8520056        s = 2.7291720
r = 1.1205492        v = -3.9842932

C = 3.3495724 × 10⁷

$Y = (3.3495724 \times 10^{7})\,\left[(t + 1.1205492)^2 + (2.7291720)^2\right]^{-5.8520056}\, e^{\,3.9842932\,\tan^{-1}\left(\frac{t + 1.1205492}{2.7291720}\right)}$

Curve Range (-∞, ∞)

IV Curve                      Normal Curve
χ² = 83.10963                 χ² = 109.93934
Fit is not good               Fit is not good

Midpoint (%)   Frequency   Graduation Mid-ordinates   Graduation Areas   Normal Mid-ordinates
Data Range (5.7, 77.?)        Mean = 32.77%
Class interval = 8%           σ = 13.84

α₃² = .00047316776   δ = -.11485628      Type IB curve
r1 = -3.9530723      r2 = 4.1474939
m1 = 6.5279175       m2 = 6.8881809

C = 2.5711236 × 10⁻⁶

$Y = (2.5711236 \times 10^{-6})\,(t + 3.9530723)^{6.5279175}\,(4.1474939 - t)^{6.8881809}$

Curve Range (-21.9, 90.1)

IB Curve                      Normal Curve
χ² = 16.2566686               χ² = 17.879294
Fit is good at .05 level      Fit is good at .04 level
Water Saturation (Sw) -- Field 5

Data Range (12.4, 92.         Mean = 52.17
Class interval = 8%           σ = 16.0%

α₃² = .000611895     δ = -.02569028      Normal type curve

C = 244.82

$Y = 244.82\,e^{-t^2/2}$

Curve Range (-∞, ∞)

Normal Curve
χ² = 14.359695
Fit is good at .20 level
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Summary 

A concise and easily visualized summary of the distributions
of the data may be shown by means of the (α₃², δ) chart. Figure 2
is the complete (α₃², δ) chart; however, for purposes of showing
the distributions of the properties under consideration, only
the necessary portion of the chart has been prepared for each of
the field properties.

Figure 21 shows the distributions for porosity for the five
fields. There appears to be no pattern to the types of
distributions, in that Fields 1 and 3 are of the Type IB,
Field 4 is a Type IIIB, Field 2 is a Type VIB, and Field 5 is
a Type IV.


FIGURE 21  (α₃², δ) Chart of Porosity (Normal Curve is Pt. 0,0)
Of the five fields, only Field 4 satisfied the goodness of fit
test for the normal curve. This is somewhat in contrast to the
conclusion of Jan Law that:

     "With some exceptions permeability and porosity
     assemblages give respectively satisfactory logarithmic
     and arithmetic normal frequency distributions."

In the present study, porosity data that gave a satisfactory
arithmetic normal frequency distribution were the exception.

The (α₃², δ) chart for the So data, Figure 22, indicates a
somewhat different situation regarding normal distributions.
Four of the five fields (1, 3, 4, and 5) satisfied the goodness
of fit test for normal distributions.
FIGURE 22  (α₃², δ) Chart of So


All five fields for Sw were fitted by a Type I curve with
varying values of α₃² and δ. Field 5 satisfied the goodness of
fit test for the normal curve.


FIGURE 23  (α₃², δ) Chart of Sw


It is noted that both the So and Sw distributions for Field 5,
which has a considerably greater number of samples
(approximately 1700) than any of the other fields, satisfy the
goodness of fit test for the normal curve. This could indicate
that if large enough samples are taken, the So and Sw
distributions approach normality. For sampling considerations,
the assumption that So and Sw populations are normally
distributed would permit many established techniques to be
applied to the analysis of these properties.

A study was made of what effect the choice of interval size
would have on the α₃² and δ values. For each set of data various
interval widths were used, i.e., for porosity 1 percent,
1.5 percent, 2 percent, and 3 percent interval widths, to
calculate the parameters α₃² and δ. For all fields with the
exception of Field 3, no appreciable difference was noted in the
values of α₃² and δ for the various interval sizes tried. There
were only 132 observations for Field 3, and the calculated values
of α₃² and δ were more sensitive to the size of interval used,
with α₃² ranging from .641 to .279 and δ ranging from -.249 to
-.099 as the interval size was varied from 1 to 2.5 percent in
steps of .5 percent.

It was concluded from this study that with a sufficiently
large number of observations (> 200), a width chosen to give
between 7 and 15 intervals will give consistent results in the
calculation of α₃² and δ.
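The interval-size study described above can be sketched as follows. This is only an illustration under stated assumptions: δ is taken to be Craig's criterion, δ = (2α₄ - 3α₃² - 6)/(α₄ + 3), which this chapter does not restate; no Sheppard correction is applied; and the 132-value sample is synthetic, not the Field 3 data. The function names are hypothetical.

```python
import math
import random


def alpha3_sq_and_delta(values, weights=None):
    """Return (alpha_3^2, delta) from the central moments of weighted data."""
    if weights is None:
        weights = [1.0] * len(values)
    n = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / n
    m2, m3, m4 = (
        sum(w * (x - mean) ** p for x, w in zip(values, weights)) / n
        for p in (2, 3, 4)
    )
    alpha3_sq = m3 ** 2 / m2 ** 3
    alpha4 = m4 / m2 ** 2
    # Assumed definition (Craig's criterion); not restated in this chapter.
    delta = (2 * alpha4 - 3 * alpha3_sq - 6) / (alpha4 + 3)
    return alpha3_sq, delta


def group_into_classes(values, width, origin=0.0):
    """Group raw values into classes; return (class midpoints, frequencies)."""
    counts = {}
    for x in values:
        k = math.floor((x - origin) / width)
        counts[k] = counts.get(k, 0) + 1
    keys = sorted(counts)
    return [origin + (k + 0.5) * width for k in keys], [counts[k] for k in keys]


# Synthetic, negatively skewed "porosity-like" sample (illustrative only).
rng = random.Random(0)
sample = [10.0 + 14.0 * rng.betavariate(5, 2) for _ in range(132)]

for width in (1.0, 1.5, 2.0, 2.5):
    mids, freqs = group_into_classes(sample, width)
    a3sq, delta = alpha3_sq_and_delta(mids, freqs)
    print(f"width {width}: alpha3^2 = {a3sq:.3f}, delta = {delta:.3f}")
```

With a small sample the printed pairs drift noticeably as the width changes, which is the kind of sensitivity reported for Field 3.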

Although it was not possible to study enough fields to
determine the relation between reservoir type and the parameters
α₃² and δ herein measured, it would be reasonable to assume that
such a relation exists. Specifically, the parameters α₃² and δ
would be expected to reflect the variation of individual
properties throughout a single depositional unit. They should
also characterize the depositional unit, distinguishing it from
similar units in a geologic basin or province. Within this
context, the variables α₃² and δ could represent properties which
could be correlated in the geologic sense for determining
stratigraphic equality.

If data were available in sufficient quantity to yield α₃² and
δ parameters for individual wells, then these variables could be
contoured for specific reservoirs or used to segment the
reservoir into smaller, more homogeneous units for mathematical
analysis. The values themselves would then provide the
information necessary for preparation of a mathematical model of
the reservoir including the effects of heterogeneity.

Thus  in  reservoir  models  there  is  no  "a  priori"  reason 
why  the  Gaussian  normal  distribution  function  need  be  used 
to  study  the  reservoir  performance.   If  experimental  evidence 
indicates  the  existence  of  a  non-normal  Pearson  type 
distribution  function  for  a  particular  property  then  this 
specific  distribution  function  could  be  used. 

However, before quantitative use is made of the statistics
α₃² and δ, it should be emphasized that these apply only to the
set of data from which they were extracted. They are only
estimates of the parameters α₃r² and δr, where the subscript r
denotes the value of the parameters for the population of samples
which comprise the entire reservoir. For the normal distribution,
which is described by the mean and standard deviation, techniques
are well known that employ these sample statistics, X̄ and σ, to
estimate population parameters. However, very little information
is available for estimating the population parameters α₃r² and δr
from the sample statistics α₃² and δ.


CHAPTER  IV 
ANALYSIS OF LOGARITHMS OF DATA


Introduction 


Jan Law in his work with core data concluded that
"with some exceptions permeability....assemblages give
satisfactory logarithmic....normal frequency distributions."
The purpose of this chapter is to investigate the applicability
of Jan Law's conclusion to the core data available for the
fields under study.

It can be observed from Figures 4 and 8 that the frequency
distributions of the permeability of Fields 1 and 2 are markedly
skewed. The distributions are of the Pearson Type IJ. Does the
distribution of the logarithms of these data approximate the
normal curve?

Logarithmic Distributions

If the distributions of these data do satisfy a logarithmic
normal frequency distribution, an accumulated frequency curve of
the permeability data plotted on logarithmic probability paper
should follow a straight line.

The sample data may be cumulated and put in percentage form
as in Tables V and VI. These cumulative percentages may then be
plotted on logarithmic probability paper. If the resulting curve
is approximately a straight line, we may proceed with assurance
to fit a normal curve to the logarithms of the data.

Three cycle logarithmic probability paper was used to plot
the cumulative percentages from Tables V and VI.


TABLE V  Cumulative Distribution of Permeability for Field 1

Permeability in md.    Number of Measurements    Percent of Total

less than    1.?0              130                      10
             2.89              195                      15
             6.25              260                      20
            11.00              325                      25
            16.00              390                      30
            24.00              455                      35
            30.00              520                      40
            37.00              585                      45
            45.00              651                      50
            53.00              715                      55
            65.00              780                      60
            71.00              845                      65
            84.00              910                      70
            97.00              975                      75
           111.00             1040                      80
           126.50             1105                      85
           151.50             1170                      90
           185.00             1235                      95
           439.00             1303                     100

TABLE VI  Cumulative Distribution of Permeability for Field 2

Permeability in md.    Number of Measurements    Percent of Total

less than    1.1                31                       5
             1.8                63                      10
             3.7                94                      15
             7.0               126                      20
            10.0               158                      25
            15.0               189                      30
            20.0               220                      35
            24.9               252                      40
            29.0               283                      45
            32.0               317                      50
            36.0               346                      55
            42.0               378                      60
            48.0               409                      65
            53.0               441                      70
            63.0               472                      75
            73.0               504                      80
            90.0               535                      85
           122.8               572                      90
           175.0               603                      95
           380.0               635                     100
The resultant curves obtained for the two fields are shown in
Figures 24 and 25 respectively. From visual observation it
appears that the two curves do not approximate straight lines.
Therefore it may be concluded that the permeability data for
these fields do not follow satisfactorily the logarithmic normal
distribution.
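A numerical counterpart to the probability-paper plot is sketched here, using the Table VI values for Field 2. It is an illustration, not the procedure used in this study: the function name is hypothetical, and the correlation coefficient is only a rough stand-in for the visual judgment of straightness, since a curve with systematic bending can still give a coefficient close to 1.

```python
import math
from statistics import NormalDist


def probability_plot_r(perm_md, cum_percent):
    """Correlation of log10(permeability) with the normal quantiles of the
    cumulative percentages; 0 and 100 percent are skipped because their
    quantiles are infinite."""
    pairs = [
        (math.log10(k), NormalDist().inv_cdf(p / 100.0))
        for k, p in zip(perm_md, cum_percent)
        if 0.0 < p < 100.0
    ]
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    mx = sum(xs) / len(xs)
    mz = sum(zs) / len(zs)
    sxz = sum((x - mx) * (z - mz) for x, z in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / math.sqrt(sxx * szz)


# Table VI: permeability (md) below which the stated percent of samples fall.
TABLE_VI_K = [1.1, 1.8, 3.7, 7.0, 10.0, 15.0, 20.0, 24.9, 29.0, 32.0,
              36.0, 42.0, 48.0, 53.0, 63.0, 73.0, 90.0, 122.8, 175.0, 380.0]
TABLE_VI_PCT = list(range(5, 101, 5))

r_field2 = probability_plot_r(TABLE_VI_K, TABLE_VI_PCT)
print(f"Field 2 probability-plot correlation: {r_field2:.4f}")
```

Plotting the residuals from the fitted straight line against the normal quantile would show the curvature more directly than the single coefficient.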

The question that now arises is: if the logarithms of the data
are not normally distributed, what distribution pattern do they
follow? To answer this question, the permeability data were
converted to logarithms.

The logarithms of the permeability of the two fields were
then treated as data to which a Pearson frequency curve was
fitted. The calculated results using the logarithms are shown on
the following pages, with the calculations and figures for
Fields 1 and 2 respectively shown.

Fitting the Logarithmic Data to Pearson Curves

The procedure outlined in Chapter III was used to fit the
logarithms of the permeability of Fields 1 and 2 to a Pearson
type curve. The curves obtained are shown in Figures 26 and 27
and were found to be Type IJ and IB. Additionally, a normal curve
fitted to the histogram is shown for comparative purposes.

To simplify the interpretation of Figures 26 and 27, the
abscissas of the curves were converted to an interval scale
which is denoted as a π scale. Tables VII and VIII show the
logarithm value and the millidarcy value of each interval of
the abscissa.
Data Range (log) (-1.30103, 2.64246)      Mean (log) = 1.4288151
Class Interval = .30000                   σ = .74250

α₃² = .86963292      δ = -.47453376      Type IJ Curve
r1 = -1.0619523      r2 = 3.0271256
m1 = -.42484190      m2 = .63950490

C = 7.5824862

$Y = 7.5824862\,(t + 1.0619523)^{-.42484190}\,(3.0271256 - t)^{.63950490}$

Range (log) (-1.17381, 2.21681)

IJ Curve                      Normal Curve
χ² = 30.534208                χ² = 428.38022
Fit is good at .001 level     Fit is not good


π Scale                  Graduation       Graduation    Normal Curve
Midpoint    Frequency    Mid-ordinates    Areas         Mid-ordinates

  1.5          17            22.8           22.6            6.1
  2.5          53            38.5           38.5           16.4
  3.5          59            53.3           53.3           37.6
  4.5          74            68.6           68.7           73.3
  5.5          57            85.5           85.6          121.3
  6.5          89           105.4          105.6          170.5
  7.5         132           130.6          131.0          203.8
  8.5         209           166.7          167.8          206.9
  9.5         297           231.7          236.6          178.6
 10.5         266           308.8          289.1          130.9
 11.5          49
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0.0 
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43.2 
Calculations log K -- Field 2

Data Range (log) (-1.2040, 2.5800)        Mean (log) = 1.373706
Class Interval = .4000                    σ = .6920

α₃² = .625648        δ = -.2240993       Type IB Curve
r1 = -1.5577193      r2 = 5.0873713
m1 = .62325930       m2 = 4.3013559

C = 8.9991354 × 10⁻²

$Y = (8.9991354 \times 10^{-2})\,(t + 1.5577193)^{.62325930}\,(5.0873713 - t)^{4.3013559}$

Range (log) (-2.14621, 2.473790)

IB Curve                      Normal Curve
χ² = 24.989362                χ² = 108.47524
Fit is good at .005 level     Fit is not good

π Scale                  Graduation       Graduation    Normal Curve
Midpoint    Frequency    Mid-ordinates    Areas         (Mid-ordinates)
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17.7 

18.0 
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3.5 

52 
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4.5 

37 
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78.1 

5.5 

76 
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96.2 

126.4 

6.5 

163 

131.7 

131.2 

146.5 

7.5 

178 
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1-53.1 
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8.5 
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130.1 
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Summary 

From the cumulative frequency curves and the logarithmic
frequency distribution curves it may be concluded that the
permeability data under study do not satisfactorily fit a
logarithmic normal curve. It is not the intent of this study to
dispute Jan Law's conclusion that the logarithms of permeability
data approximate a normal curve, but rather to express a word of
caution against indiscriminately using the logarithmic normal
curve to analyze the permeability distribution of a particular
set of data. The data used by Jan Law came from a different
geographical area and from fields with different depositional
properties than the data for this study. Therefore it appears
that each and every set of data must be analyzed to determine
whether the logarithmic normal curve is applicable to it.

Logarithms do provide a means to remove some skewness from a
set of data, but not necessarily to convert it to a normal
distribution. If sampling techniques were developed for
Pearson's Type IB distributions, the conversion of permeability
data to logarithms for analysis could serve a very useful
purpose. At present, though, this study indicates that caution
must be exercised against placing too great a degree of
reliability on conclusions drawn from the analysis of the
distributions of the logarithms of permeability data.


CHAPTER  V 
ANALYSIS  OF  VARIANCE 


Introduction 


From each core analysis there may be readily obtained a sample
mean (X̄) and a sample variance (Sx²) where:

(5-1)    $\bar{X}_{W1} = \frac{\sum X}{n}$

and,

(5-2)    $S_X^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$

where X = individual measurements within each well sample
      n = number of samples.

Within a field the various means obtained from the different
wells most likely will not be numerically identical. It is
desired to determine whether the difference between the various
means, X̄1, X̄2, ..., X̄k, can be explained by random errors (that
is, the means are statistically the same and represent identical
populations), or whether the means are actually different and the
differences do not result from sampling errors.

To illustrate the techniques involved in making these
determinations, the data obtained from the core analysis for
porosity of Field 2 will be used. From these data the following
sample means and sample variances were calculated for the wells
cored:
            Number of
Well        Measurements      Mean (X̄)      Variance (Sx²)

  1             24              .199           .000198
  2             28              .189           .000202
  3             26              .200           .000128
  4             27              .205           .000255
  5             18              .199           .000263
  6             36              .186           .000890
  7             14              .184           .000013
  8             23              .208           .000174
  9             34              .191           .000607
 10             20              .190           .000223
 11             31              .203           .000402
 12             19              .208           .000894
 13             15              .181           .000946
 14             30              .212           .000312

Total          345             2.755


From the above it can be observed that the means range from
.181 to .212. The question which arises is whether ordinary
random sampling errors account for the differences in these
means, or whether it may be concluded that the means are
different because of reasons other than sampling fluctuations.


The  F  Test 

The F test, devised by R. A. Fisher and named in his honor, is
a means for answering the above question. The basis for this
test lies in the availability of two independent estimates of
the population variance. Consider a single core analysis W1
consisting of all of the porosity measurements from Well 1. The
variance S²W1 of that set is a measure of the internal scatter of
the measurements of the population. Similar statements apply to
S²W2, S²W3, ..., S²Wk. However, these variances are influenced by
sampling error and are not necessarily equal to the population
variance. A better estimate of the internal scatter of the
measurements may be obtained by pooling the individual sample
variances. The better estimate of the within-sample or error
variance is given by:

(5-3)    $S_e^2 = \frac{\sum^{n_1}(X_{W1} - \bar{X}_{W1})^2 + \sum^{n_2}(X_{W2} - \bar{X}_{W2})^2 + \cdots + \sum^{n_k}(X_{Wk} - \bar{X}_{Wk})^2}{(n_1 + n_2 + \cdots + n_k) - n_i}$

where n_i denotes the number of wells and n_1, n_2, ..., n_k
denote the number of measurements in wells W1, W2, ..., Wk.

The numerator of equation (5-3) is called the "within-groups
sum of the squares", and Se² may be called the mean square of
individual measurements.

The degrees of freedom associated with Se² are:

(5-4)    $f_e = \sum^{n_i} (n_k)_i - n_i$

where n_i denotes the number of wells and (n_k)_i denotes the
number of measurements in the ith well, or fe = the total number
of measurements in all wells less the number of wells.
Another independent estimate of the population variance may be
obtained from the sample means. The variance of the population
of means is:

(5-5)    $S_{m,p}^2 = \frac{\sum^{n_i} (\bar{X}_i - \bar{X}_p)^2}{n_i - 1}$

where:

(5-6)    $\bar{X}_p = \frac{\bar{X}_{W1} + \bar{X}_{W2} + \cdots + \bar{X}_{Wk}}{\text{Number of Wells}}$

However, the variance of the population of means is not in
itself equal to the population variance. If all samples were of
the same size, the second estimate of the population variance
would be:

(5-7)    $S_p^2 = n_k\, S_{m,p}^2$

If the samples are of different size, n_1 ≠ n_2 ≠ n_k, which is
usually the case in core analysis, a pooled result n_0 is to be
used as an average sample size:

(5-8)    $n_0 = \frac{1}{n_i - 1}\left[\sum n_k - \frac{\sum n_k^2}{\sum n_k}\right]$

Therefore:

(5-9)    $S_p^2 = n_0\, S_{m,p}^2$

When Sp² is estimated by either equation (5-7) or (5-9), n_i - 1
degrees of freedom are involved. Sp² is often referred to as the
mean square of sample means.

If the random factors which give rise to the within-sample or
error variance Se² are the only factors causing the differences
between the sample means, the two independent variance estimates
Se² and Sp² should, except for sampling error, be equal. The
probability of particular ratios of Sp² to Se² has been computed
by Fisher. This distribution is the well known F distribution,
which is a function of the degrees of freedom associated with
both Sp² and Se². Therefore the ratio, F,

(5-10)    $F = \frac{S_p^2}{S_e^2}$

is a measure of whether or not random sampling error can account
for the observed differences between sample means. By the use of
the F table and the degrees of freedom associated with Sp² and
Se², a determination of whether random sampling errors could
account for the differences between the means can be made.
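The chain of equations (5-3) through (5-10) can be sketched from per-well summary statistics alone. This is a minimal illustration with a hypothetical function name, assuming each well variance was computed with an n - 1 divisor so that (n - 1)·S² recovers the within-well sum of squares. The well counts are those of Field 2, reading the counts for wells 12 and 14 as 19 and 30, which is what the stated total of 345 and n_0 = 24.507 require.

```python
def f_test_from_summary(ns, means, variances):
    """One-way F ratio from per-well counts, means, and variances."""
    k = len(ns)                       # number of wells
    total = sum(ns)                   # total number of measurements
    # (5-3): pooled within-well (error) variance
    se2 = sum((n - 1) * v for n, v in zip(ns, variances)) / (total - k)
    # (5-6) and (5-5): unweighted mean of the well means and their variance
    xbar_p = sum(means) / k
    s2_mp = sum((m - xbar_p) ** 2 for m in means) / (k - 1)
    # (5-8): pooled average sample size for unequal well sizes
    n0 = (total - sum(n * n for n in ns) / total) / (k - 1)
    # (5-9) and (5-10)
    sp2 = n0 * s2_mp
    return {"se2": se2, "s2_mp": s2_mp, "n0": n0, "sp2": sp2,
            "F": sp2 / se2, "df": (k - 1, total - k)}


# Field 2 well sizes.
WELL_N = [24, 28, 26, 27, 18, 36, 14, 23, 34, 20, 31, 19, 15, 30]
n0_field2 = (sum(WELL_N) - sum(n * n for n in WELL_N) / sum(WELL_N)) / (len(WELL_N) - 1)
print(f"n0 = {n0_field2:.3f}")

# The ratio formed from the intermediate values quoted in the text.
print(f"F = {24.507 * 9.128e-5 / 0.0004559:.3f}")
```

Because the tabulated means and variances are rounded to three significant figures, feeding them to `f_test_from_summary` gives values of Se² and F that only approximate the ones quoted in the text.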

Conducting the F test on the porosity core analysis results
for Field 2 gives:

(5-3)    $S_e^2 = \frac{\sum^{24}(X_{W1} - .199)^2 + \sum^{28}(X_{W2} - .189)^2 + \cdots}{(24 + 28 + \cdots + 30) - 14}$

         Se² = .0004559

(5-6)    $\bar{X}_p = \frac{.199 + .189 + .200 + \cdots + .212}{14}$

(5-5)    $S_{m,p}^2 = \frac{(.199 - .197)^2 + (.189 - .197)^2 + \cdots + (.212 - .197)^2}{14 - 1}$

         S²m,p = 9.128 × 10⁻⁵

(5-8)    $n_0 = \frac{1}{14 - 1}\left[345 - \frac{24^2 + 28^2 + 26^2 + \cdots + 30^2}{24 + 28 + 26 + \cdots + 30}\right]$

         n_0 = 24.507

Therefore:

(5-9)    Sp² = n_0 S²m,p = (24.507)(9.128 × 10⁻⁵) = .002237

(5-10)   $F = \frac{S_p^2}{S_e^2} = \frac{.002237}{.000456}$

         F = 4.907

Since Sp² is associated with 14 - 1 = 13 degrees of freedom
and Se² with 345 - 14 = 331 degrees of freedom, the F table gives
F = 1.76 at the 5 percent confidence level and F = 2.14 at the
1 percent confidence level.

The F value of 2.14 at the 1 percent level indicates that, if
random errors alone were responsible for the observed
differences between means, there would be only 1 chance out of
100 that the calculated F would exceed 2.14. In this case
4.907 > 2.14; therefore it is reasonable to conclude that the
populations represented by the sample means are actually
different and the difference cannot be explained by random
variation.

Since there is an actual difference between the 14 sample
means, further testing must now be conducted to attempt to
determine if there are sub-groupings of these means that are
statistically homogeneous. For this field, the wells were lumped
together by leases and an F test conducted to determine if the
means of the three leases were statistically the same or whether
they were different.


Field 2

            Number of
Lease       Measurements      Mean (X̄)      Variance (Sx²)

  1            179             .1948           .000383
  2             71             .1950           .000670
  3             95             .2031           .000641

Total          345

For leases:

         Se² = .000512

         Sp² = .002368

         $F = \frac{S_p^2}{S_e^2} = 4.62$

Degrees of freedom with Se² = 345 - 3 = 342
Degrees of freedom with Sp² = 3 - 1 = 2

The F table gives:

F = 3.03 at the 5 percent confidence level
F = 4.69 at the 1 percent confidence level

Since the calculated F of 4.62 is less than 4.69, the value at
the 1 percent level, the observed difference in the three lease
means could reasonably be explained on the basis of the scatter
of the observed data, and the three means are statistically the
same.

It may be concluded from the two F tests that the individual
well means are statistically different, but the lease means are
statistically the same. The well means each represent individual
points in the field, and since these points are different,
heterogeneity exists between the various wells. As the wells are
lumped together to form a lease mean, the heterogeneity of the
individual points is absorbed into larger, somewhat homogeneous
units, these larger units exhibiting similar statistical
characteristics.

Modified  Tukey  Test 

The results of the F test on the means of the 14 wells gave
convincing evidence of differences among the means, but the F
test gave no clue as to how many differences there were. In a
group of a means there are in all a(a-1)/2 potential
differences; 14(13)/2 = 91 among the wells. Does each mean
differ from all the rest, or are some of them the same?

One method of investigating the differences is by the Tukey
test (modified). The test is made by computing a difference, D,
which is significant at the 5 percent level, then comparing it
with the a(a-1)/2 sample differences. D is the product of Sx̄ and
a factor, Q, taken from a Q table which is itself computed on the
basis of the distribution of the deviations among compared means.

(5-11)    $S_{\bar{X}} = \sqrt{\frac{S_e^2}{n_0}}$

and:

(5-12)    $D = Q\, S_{\bar{X}}$
To determine Q, enter the upper heading of the Q table with
the number of treatments (14 wells) and the degrees of freedom,
f, for samples (331) indicated at the left of the table. From
the table with 14 treatments and f = 331, Q is 4.74.

For porosity, Field 2,

         Sx̄² = 1.86 × 10⁻⁵

and,

         Sx̄ = 4.313 × 10⁻³

Therefore:

         D = (4.74)(4.313 × 10⁻³) = .02048

The differences to be compared with D are shown in Table IX.

In Table IX the X̄ are arranged from high to low and each is
subtracted from those above. Of the 91 differences, only 15,
indicated by *, exceed D = .0205. One inference from the table
is that the population represented by the mean of well 13 is
different from that of wells 14, 8, 12, 4 and 11. Similar
statements may be made about any particular well. For instance,
it may be inferred that the population represented by the mean
of well 14 is different from that of wells 13, 7, 6, 2, 10, and
9, but the differences between well 14 and the remaining wells
are not significant. The above procedure is therefore useful for
seeking homogeneities and dissimilarities among wells. Such
information could be valuable in explaining similarities and
differences in individual well production behavior.
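The comparison of every pair of well means with the single difference D can be sketched as below. The well means are those tabulated for Field 2 porosity, reading the means of wells 6 and 13 as .186 and .181, consistent with the stated range of .181 to .212 and with the wells named above; the function name is hypothetical.

```python
from itertools import combinations

# Mean porosity by well number, Field 2.
WELL_MEANS = {
    1: 0.199, 2: 0.189, 3: 0.200, 4: 0.205, 5: 0.199, 6: 0.186, 7: 0.184,
    8: 0.208, 9: 0.191, 10: 0.190, 11: 0.203, 12: 0.208, 13: 0.181, 14: 0.212,
}


def significant_pairs(means, d):
    """Return the pairs of wells whose means differ by more than d."""
    return [
        (a, b)
        for a, b in combinations(sorted(means), 2)
        if abs(means[a] - means[b]) > d
    ]


pairs = significant_pairs(WELL_MEANS, 0.0205)
print(len(pairs), "of", 14 * 13 // 2, "differences exceed D")
for well in (13, 14):
    partners = sorted(b if a == well else a for a, b in pairs if well in (a, b))
    print(f"well {well} differs from wells {partners}")
```

The same function can be reused for any other property or field by replacing the dictionary of means and the value of D.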

Sequential Method of Testing

A sequential method of testing the differences between
the means devised by Hartley⁷ is a somewhat more powerful
testing procedure than the modified Tukey method. For this
test, not one Q is taken from the Q table but several, one
for each range of the treatment means. For the well means,
adjacent means in the array are tested with Q = 2.77 for a =
2; for two ranks apart in the array use Q = 3.32 for a = 3;
for three ranks apart in the array use Q = 3.63 for a = 4;
use Q = 4.74 only for the extreme range where a = 14. The
corresponding D are:

 a      Q       D

 2     2.77    .012
 3     3.32    .014
 4     3.63    .016
 5     3.86    .017
 6     4.03    .0174
 7     4.17    .018
 8     4.29    .0185
 9     4.39    .0189
10     4.47    .0193
11     4.55    .0196
12     4.62    .0201
13     4.68    .0202
14     4.74    .0205

These D are entered in the northeast-southwest diagonals
of the table of differences, Table X, with the D's in
parentheses. Each difference is now compared with its own D,
the  difference  being  judged  significant  if  it  is  larger  than 
its  D.   A  useful  rule  to  be  observed  is  that  if  any  difference 
is  less  than  its  D  then  no  further  testing  needs  to  be  done 
to  the  right  of  that  difference  in  its  row  or  below  it  in 
the  column. 
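
A short Python sketch may make the bookkeeping of the sequential method clearer. It is a sketch under assumptions, not the thesis's own program: the function names and the four demonstration means are hypothetical, the Q values are those tabulated above (with 3.86 taken for a = 5, the value consistent with the tabulated D of .017), and the short-cut rule for skipping tests is omitted so that every difference is simply compared with its own D.

```python
# Q values for ranges a = 2 ... 14 as tabulated in the text
# (a = 5 taken as 3.86, consistent with the tabulated D of .017).
Q_TABLE = {
    2: 2.77, 3: 3.32, 4: 3.63, 5: 3.86, 6: 4.03, 7: 4.17, 8: 4.29,
    9: 4.39, 10: 4.47, 11: 4.55, 12: 4.62, 13: 4.68, 14: 4.74,
}
S_MEAN = 4.313e-3  # standard deviation of the mean quoted in the text


def allowance(a, s_mean=S_MEAN):
    """D for a range spanning a ranked means: D = Q(a) * S_mean."""
    return Q_TABLE[a] * s_mean


def sequential_test(ranked):
    """ranked: list of (label, mean) sorted from high to low.

    Each difference is compared with the D belonging to its own range.
    """
    significant = []
    for i in range(len(ranked)):
        for j in range(i + 1, len(ranked)):
            a = j - i + 1  # number of means spanned by this difference
            diff = ranked[i][1] - ranked[j][1]
            if diff > allowance(a):
                significant.append((ranked[i][0], ranked[j][0]))
    return significant


# Hypothetical ranked means, for illustration only.
demo = [("A", 0.215), ("B", 0.205), ("C", 0.200), ("D", 0.190)]
found = sequential_test(demo)
```

Because adjacent means are tested against a smaller D than the extreme range, the sequential method can flag differences that a single D of .0205 would pass over.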

From Table X, it is observed that there are 18 significant
differences, marked by an *, which signifies that the
sequential method detects a greater number of differences
than the modified Tukey method.

From  the  table  the  wells  similar  to  any  given  well  may 
be  detected.   For  example,  it  may  be  stated  that  the  mean  of 
well  9  is  not  significantly  different  from  any  of  the  other 
means  except  well  14.   The  mean  porosity  of  well  3  is  not 
significantly  different  from  that  of  any  other  mean  except 
for  well  13. 

The  sequential  testing  method  then  provides  a  comparison 
mechanism  for  determining  the  relationship  between  any  well 
mean  desired  with  the  other  well  means  in  the  group. 

Homogeneity  of  Variances 

Several  tests  are  available  for  testing  the  homogeneity 
of  the  variances  of  the  several  samples.   The  appropriate 
test  depends  upon  the  type  of  possible  variation  of  the 
variances that is visualized. If the deviation from equality,
i.e., ($\sigma_1^2 = \sigma_2^2 = \dots = \sigma_K^2$), is conceived to be caused by
a random variation then an appropriate variance homogeneity
test is Bartlett's test⁸ wherein the statistics B and C are
computed where:

(5-13)  $B = \dfrac{1}{C}\left(v \ln S^2 - \sum v_i \ln S_i^2\right) = \dfrac{2.30259}{C}\left(v \log_{10} S^2 - \sum v_i \log_{10} S_i^2\right)$

(5-14)  $C = 1 + \dfrac{\sum \dfrac{1}{v_i} - \dfrac{1}{v}}{3(K-1)}$

where:

$v = \sum v_i$  and  $S^2 = \dfrac{\sum v_i S_i^2}{v}$

$v_i$ is the degrees of freedom associated with each sample
variance $S_i^2$ and K is the number of variances being
considered.

The statistic B is known to satisfactorily approximate
the Chi-square ($\chi^2$) distribution corresponding to K-1 degrees
of freedom. A calculated B value greater than the $\chi^2$ value
at the given degrees of freedom and confidence coefficient
is evidence for rejecting the hypothesis $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_K^2$.
Calculations for porosities of Field 2 are given below
to illustrate the method.


TABLE XI

Computation of Bartlett's Test of
Homogeneity of Variance-Porosity Field 2

Well
No.      v_i    S_i^2*    log S_i^2    v_i log S_i^2    1/v_i      v_i S_i^2

  1       23    1.983     .29732         6.8384        .044873      45.609
  2       27    2.019     .30514         8.2388        .037037      54.513
  3       25    1.283     .10823         2.7058        .040000      32.075
  4       26    2.551     .39967        10.3914        .03846       66.326
  5       17    2.633     .42537         7.2313        .05882       44.761
  6       35    8.903     .94954        33.2339        .02857      311.605
  7       13    1.292     .11126         1.4464        .07692       16.796
  8       22    1.740     .24055         5.2921        .04546       38.280
  9       33    6.071     .78326        25.8476        .03030      200.343
 10       19    2.282     .35832         6.8081        .05263       43.358
 11       30    4.023     .60455        18.1365        .03333      120.690
 12       18    8.938     .95124        17.1223        .05555      160.884
 13       14    9.458     .97580        13.6612        .07143      132.412
 14       29    3.109     .49262        14.2860        .03448       90.161

Total    331                           171.2396        .64788     1357.813

*Converting porosity data to percent (.199 to 19.9) changes
variance from .0001983 to 1.983 percent.


K = 14   v = 331

(5-13)  $B = \dfrac{2.30259}{C}\left(v \log S^2 - \sum v_i \log S_i^2\right)$

(5-14)  $C = 1 + \dfrac{\sum \dfrac{1}{v_i} - \dfrac{1}{v}}{3(K-1)}$,   $\dfrac{1}{v} = \dfrac{1}{331} = .0030211$

$C = 1 + \dfrac{.64788 - .0030211}{3(14-1)}$

C = 1 + .01433 = 1.01433

$S^2 = \dfrac{\sum v_i S_i^2}{v} = \dfrac{1357.813}{331} = 4.1022$

$\log_{10} S^2 = .61302$

(5-13)  $B = \left(\dfrac{2.30259}{1.01433}\right)\left((331)(.61302) - 171.2396\right)$

B = (2.27006)(31.67)
B = 71.98

From the $\chi^2$ table with 13 degrees of freedom (14-1):

$\chi^2$ = 22.4 (at 5 percent level)
$\chi^2$ = 27.7 (at 1 percent level)
$\chi^2$ = 29.8 (at 1/2 percent level)

A $\chi^2$ of 29.8 at the 1/2 percent level indicates that,
if the variances were in fact equal, there would be only 1
chance out of 200 that random variation alone would produce
a calculated B > 29.8. Since B = 71.98, it may be
concluded that the differences between the variances of the
14 wells are significant and cannot be accounted for by
random variations. Therefore the hypothesis that
$\sigma_1^2 = \sigma_2^2 = \dots = \sigma_K^2$ is rejected.
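
The Bartlett computation can be checked with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name is hypothetical, the degrees of freedom and variances are those of Table XI (variances in percent squared), and natural logarithms are used directly in place of the tabulated common logarithms. Evaluated this way the statistic comes out close to, though not identical with, the hand-computed value reported above.

```python
import math

# Degrees of freedom and sample variances for the 14 wells (Table XI).
V = [23, 27, 25, 26, 17, 35, 13, 22, 33, 19, 30, 18, 14, 29]
S2 = [1.983, 2.019, 1.283, 2.551, 2.633, 8.903, 1.292,
      1.740, 6.071, 2.282, 4.023, 8.938, 9.458, 3.109]


def bartlett(dfs, variances):
    """Return (B, C, pooled variance) per equations (5-13) and (5-14)."""
    k = len(dfs)
    v = sum(dfs)
    pooled = sum(vi * si for vi, si in zip(dfs, variances)) / v
    c = 1 + (sum(1 / vi for vi in dfs) - 1 / v) / (3 * (k - 1))
    b = (v * math.log(pooled)
         - sum(vi * math.log(si) for vi, si in zip(dfs, variances))) / c
    return b, c, pooled


b, c, pooled = bartlett(V, S2)

# Chi-square value at the 1/2 percent level, 13 degrees of freedom,
# as quoted in the text.
CHI2_HALF_PERCENT = 29.8
reject = b > CHI2_HALF_PERCENT
```

Either way B is far above 29.8, so the conclusion that the well variances differ is unchanged.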

Summary 

Various  statistical  tests  that  may  be  used  to  analyze 
core  samples  by  use  of  the  core  means  and  variances  have 
been  illustrated.   A  core  sample  provides  certain  information 
about the population from which it is extracted. The respective
populations sampled by core$_1$, core$_2$, ...., core$_{14}$
for  Field  2  may  be  thought  of  as  falling  into  one  of  four 
categories: 

1.  The 14 samples represent identical populations, the
apparent differences resulting from sampling error.

2.  They represent identical variances but have different
means.

3.  They exhibit identical means but different variances.

4.  They are totally different.

The F test indicated that the populations had different
means. Bartlett's test showed that the populations had
different variances. Therefore we may conclude that the
populations represented by the 14 samples are different.

By use of the Tukey test and the Sequential test it
is possible to determine what sub-groupings of the 14
populations are statistically homogeneous. An indication
that any particular well is statistically the same as a
group of other wells should prove to be a valuable tool in
correlating the expected performance of that well with the
performance of the group with which it may be identified.

In  this  chapter  particular  attention  has  been  given  to 
the  distribution  of  sample  means  and  of  sample  variances, 
yet  in  reality  there  is  very  little  exact  knowledge,  based 
upon  sedimentation  theory,  as  to  the  appropriate  distributions 
to  expect.   It  is  to  be  hoped  that  the  methods  of  the 
present  chapter  will  be  useful  for  quantitatively  character- 
izing these  distributions  and  thereby  aid  in  the  formulation 
of  more  precise  theories  of  sedimentation. 


CHAPTER  VI 
SAMPLING 


Introduction 


The  prime  purpose  of  core  analysis  is  to  gain  some 
understanding  of  the  characteristics  of  the  field  properties 
under  consideration,  namely  porosity,  permeability,  and 
fluid  saturations.   As  previously  discussed,  once  sufficient 
information  becomes  available,  it  may  be  used  to  determine 
descriptive  parameters  for  a  field.   The  parameters  may  be 
expressed  simply  in  terms  of  averages  and  deviations  from 
the averages. Or more extensive parameters may be developed
in terms of theoretical frequency curves and the moment
parameters $\alpha_3$ and $\delta$.²

Considering  the  initial  period  in  the  life  of  a  field 
before  an  extensive  amount  of  data  becomes  available,  the 
question  arises  as  to  what  inferences  can  be  derived  from 
the  data  as  it  becomes  available. 

Sampling 

Sampling  may  be  defined  as  the  selection  of  part  of  an 
aggregate  or  population,  on  the  basis  of  which  a  judgement 
or  inference  about  the  aggregate  or  population  is  made. 
Sampling theory is a study of relationships existing
between a population and samples drawn from the population.
Statisticians have long worked on the problem of reconstructing
a universe of variables by means of samples that
comprise a small percentage of the universe or population
from which the samples were drawn. Core analysis data appears
to comply with the requisite of random sampling as stipulated
by theoretical statistics.

Sampling  theory  is  useful  in  estimation  of  the  unknown 
population  quantities  such  as  population  mean,  standard 
deviation,  variance,  etc.,  referred  to  as  population  para- 
meters, from  a  knowledge  of  corresponding  sample  quantities. 

In  general  a  study  of  inferences  made  concerning  a 
population  by  use  of  samples  drawn  from  it  together  with 
indications  of  the  accuracy  of  such  inferences  using  prob- 
ability theory,  is  called  statistical  inference. 

Considering  a  core  analysis  as  a  random  sample  drawn 
from  an  infinite  population,  the  sub-surface  of  the  earth, 
what  inferences  can  be  made  about  the  population?  From  the 
core  analysis  there  may  be  obtained  a  sample  mean  and  a 
sample  standard  deviation  where: 

(6-1)  $\bar{X} = \dfrac{\sum X}{n}$

(6-2)  $S_X = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n-1}}$

The sample mean, which serves as an estimate of the
population mean, will be of chief interest in considering a
core sample. An important theorem in statistics states
"that for almost all populations the probability distribution
of the sample mean based upon a simple random sample will be
an approximately normal one if the sample size is sufficiently
large." The standard deviation of the probability distribution
of the sample mean, denoted by $\sigma_{\bar{X}}$, is calculated by:³

(6-3)  $\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}}$

where $\sigma_X$ is the standard deviation of the population being
sampled, and n is the sample size.

Therefore if the standard deviation of the population
is known, $\sigma_{\bar{X}}$ may be calculated. However, after only one
core analysis, the standard deviation of the population is
not known and an estimate of this standard deviation must be
made. A point estimate of this population characteristic
is made from the sample standard deviation, $S_X$, and the
sample mean is the best point estimate available for the
population mean.

To  illustrate  the  application  of  point  estimates,  the 
numerical  data  obtained  from  an  assumed  first  core  analysis 
taken  from  Field  2,  Figure  28,  will  be  used. 

From the first core analysis, well 1, we obtain:

$\bar{X}_{\phi}$ = 18.6%
$S_{X\phi}$ = 2.98
$\bar{X}_{so}$ = 45.8%
$S_{Xso}$ = 4.56
$\bar{X}_{sw}$ = 45.3%
$S_{Xsw}$ = 4.14
n = 36

Therefore the best estimates of the various population
means and standard deviations are:

FIGURE 28 MAP OF WELLS CORED, FIELD 2

$\bar{X}_{\phi,\,population}$ = 18.6%
$\bar{X}_{so,\,population}$ = 45.8%
$\bar{X}_{sw,\,population}$ = 45.3%

and,

$\sigma_{X\phi}$ = 2.98
$\sigma_{Xso}$ = 4.56
$\sigma_{Xsw}$ = 4.14

And, we assume that:

$\sigma_{\bar{X}\phi} = \dfrac{\sigma_{X\phi}\ \text{estimate}}{\sqrt{n}} = \dfrac{2.98}{\sqrt{36}} = .48$

$\sigma_{\bar{X}so}$ = .76

$\sigma_{\bar{X}sw}$ = .69

For normal probability distributions, the area under
the normal distribution curve between the mean ± 2 standard
deviations⁴ is about .95 out of a total area of 1. Therefore
the following limits may be constructed with a confidence
of .95.

(6-4)  $\bar{X} \pm 2\sigma_{\bar{X}}$

$\bar{X}_{\phi} \pm 2(.48) = 18.6 \pm .96$
17.64 - 19.56

$\bar{X}_{so} \pm 2(.76) = 45.8 \pm 1.52$
44.2 - 47.4

$\bar{X}_{sw} \pm 2(.69) = 45.3 \pm 1.38$
43.9 - 46.7

We can conclude with a .95 probability that:
$\bar{X}_{\phi}$ for Field 2 is somewhere between 17.64 and 19.56%.
$\bar{X}_{so}$ for Field 2 is somewhere between 44.2 and 47.4%.
$\bar{X}_{sw}$ for Field 2 is somewhere between 43.9 and 46.7%.

The above conclusions presume knowledge of the
true value of $\sigma_X$, which was not available. Statistical
experience has indicated, though, that if the sample size is
reasonably large, n > 30, $S_X$ may be used as an estimate
of $\sigma_X$ without materially changing the reliability of the
above conclusions.
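
The limits of equation (6-4) follow mechanically from the sample mean, the sample standard deviation, and the sample size, as the following Python sketch shows. The function name and its return convention are hypothetical; the oil-saturation figures for well 1 (mean 45.8, standard deviation 4.56, n = 36) are those quoted above.

```python
import math


def mean_limits(xbar, s, n, k=2.0):
    """Return (lower, upper, sigma_mean) for limits xbar +/- k * s / sqrt(n).

    s is the sample standard deviation, used here as the estimate of the
    population standard deviation, as the text does for n > 30.
    """
    sigma_mean = s / math.sqrt(n)
    return xbar - k * sigma_mean, xbar + k * sigma_mean, sigma_mean


# Oil saturation from the first core analysis, well 1.
lower, upper, sigma_mean = mean_limits(45.8, 4.56, 36)
```

Setting k = 3 in the same function gives limits of the kind used for control charts.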

Control  Charts 

From the previous section, estimates of the probable
range of the various property means of Field 2 were determined.
The ranges were based upon one core analysis which
was taken from one point in the field. For the population
which this core sample represents, the sample mean ranges
determined should prove to be satisfactory range estimates
of the population mean. As drilling moves away from this
point, a different areal population may be encountered with
different characteristics from those determined by the
initial core sample. In this case it would be expected
that the new population characteristics may deviate from
those estimated from the initial core sample.

In statistical quality control procedures for production
processes, a control chart is used to indicate when a process
has changed or "gone out of control" and is no longer
producing within prescribed specifications. Basically the
production control procedures consist of computing a process
mean and standard deviation, constructing a control chart
within the limits of $\bar{X}_{process} \pm 3\sigma_{\bar{X}}$, taking periodic
samples from the production line and computing the mean of
the samples taken. If the mean falls between the limits
set on the control chart, it is concluded that the process
has not changed or "is in control." If the mean does not
fall within the limits, it is concluded that the process has
changed or "is out of control."⁵

Core sampling may be considered analogous to production
line sampling with areal sampling of populations considered
to be the analog of periodic process sampling. As core
samples are taken at varying distances from the initial
sample, the means may be plotted on a control chart to indicate
if the population characteristics are still "under control,"
or if not, whether a new areal population has been encountered.
For a control chart, $\pm 3\sigma_{\bar{X}}$ is normally used. Thus (from
previous section):

$\bar{X}_{\phi} \pm 3\sigma_{\bar{X}} = 18.6 \pm 3(.48)$
18.6 ± 1.44
17.16 - 20.04

$\bar{X}_{so} \pm 3\sigma_{\bar{X}} = 45.8 \pm 3(.76)$
45.8 ± 2.28
43.52 - 48.08

$\bar{X}_{sw} \pm 3\sigma_{\bar{X}} = 45.3 \pm 3(.69)$
45.3 ± 2.07
43.23 - 47.37

Using the above, a control chart may be constructed.
From Figure 28 it can be observed that the wells are located
in a general north-south direction with well 1 located
approximately at the center of the field.

For  this  particular  geographical  arrangement,  two 
control  charts  to  test  the  hypothesis  that  the  mean  of  the 
field  falls  within  the  probable  range  are  constructed,  one 
for  moving  in  a  northerly  direction,  one  for  the  southerly 
direction. 


For moving North:

[Control chart: mean porosity plotted for wells N1 through N6,
with Upper Control Limit 20.04, Average 18.6, and Lower
Control Limit 17.16.]

FIGURE 29 Control Chart for Average Porosity

The core analysis from well N1 has a $\bar{X}_{\phi}$ of 19.2. This
is plotted on the control chart, and falls within the limits
set. It is concluded that the sampling is still under
control and the population has not changed from that indicated
by the initial sample. As the core analyses from wells N2
through N6 are taken, their means are plotted on Figure 29.

Moving north from initial well:

Well    $\bar{X}_{\phi}$

N1      19.2    under control
N2      20.5    possibly out of control
N3      19.9    under control
N4      18.9    under control
N5      19.9    under control
N6      19.0    under control

It appears that the probable range of the field mean
calculated from the initial core analysis gives a good
representation of the field mean from the initial point to
well N6.

For moving South:

FIGURE 30 Control Chart for Average Porosity

Moving south from initial well:

Well    $\bar{X}_{\phi}$

S1      18.4    under control
S2      19.1    under control
S3      20.8    possibly out of control
S4      20.8    more indication of out of control
S5      21.2    out of control

For  movement  in  a  southerly  direction  the  initial 
conclusion  concerning  the  field  mean  proved  correct  from  the 
initial  point  to  well  S2.   When  it  becomes  apparent  that 
the  initial  hypothesis  concerning  the  mean  is  not  tenable, 
in  this  case  after  well  S4  was  drilled,  a  new  probable  range 
of  the  areal  mean  should  be  calculated  from  the  data  obtained 
from wells S3 and S4. This new probable range would then be
the  best  estimate  of  the  population  mean  for  the  areal 
population  from  S4  southward. 
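
The control-chart screening of successive wells can be sketched as follows. This is a simplified illustration: the function name and the two-way classification are assumptions (the text uses graded wording such as "possibly out of control"), while the center line 18.6, the standard deviation of the mean .48, and the well means are those given above.

```python
def outside_limits(means, center, sigma_mean, k=3.0):
    """Return labels of wells whose mean lies outside center +/- k*sigma."""
    lower = center - k * sigma_mean
    upper = center + k * sigma_mean
    return [well for well, x in means.items() if not lower <= x <= upper]


CENTER = 18.6      # mean porosity from the initial core analysis, percent
SIGMA_MEAN = 0.48  # standard deviation of the mean used in the text

north = {"N1": 19.2, "N2": 20.5, "N3": 19.9,
         "N4": 18.9, "N5": 19.9, "N6": 19.0}
south = {"S1": 18.4, "S2": 19.1, "S3": 20.8, "S4": 20.8, "S5": 21.2}

north_flagged = outside_limits(north, CENTER, SIGMA_MEAN)
south_flagged = outside_limits(south, CENTER, SIGMA_MEAN)
```

A single excursion such as N2 is treated in the text as only a possible change, whereas the run of consecutive excursions from S3 southward is taken as evidence of a new areal population.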

The adaptation of the above method, which is based on
production  quality  control  techniques,  to  the  analysis  of 
reservoirs  is  one  possible  way  in  which  a  quantitative  criterion 
could  be  used  to  segment  a  field  into  smaller  homogeneous 
units.   Many  extensions  to  this  procedure  are  possible. 
These  involve  the  comparison  of  the  segmentation  created  by 
use  of  the  different  reservoir  properties.   In  this  regard 
it  is  possible  that  the  areal  segments  would  not  coincide 
when  using  different  properties  such  as  porosity,  permeability, 
or  fluid  saturations.   However  there  is  also  no  "a  priori" 
reason  for  their  being  different.   Similarity  of  areas 
would  probably  indicate  a  fairly  high  degree  of  correlation 
between  the  variables  involved. 

Confidence Interval for Population Proportion

If a random sample size is large, n > 30 usually being
considered sufficiently large, a confidence interval for the
population proportion can be constructed in a similar way
as for the population mean, since the probability distribution
of the sample proportion will be assumed to be approximately
normal.

The standard deviation of the probability distribution
of the sample proportion can be estimated from:⁶

(6-5)  $S_{\bar{p}} = \sqrt{\dfrac{\bar{p}(1-\bar{p})}{n-1}}$

where $\bar{p}$ is the sample proportion and $S_{\bar{p}}$ denotes the estimate
of the true standard deviation $\sigma_{\bar{p}}$.

The confidence interval for the population proportion
is:

(6-6)  $\bar{p} \pm Z\,S_{\bar{p}}$

where Z is the normal deviate corresponding to the desired
confidence coefficient.⁷

To  illustrate  the  application  of  confidence  intervals 
for  population  proportions  to  core  analysis,  suppose  that 
it  has  been  decided  that  an  oil  saturation  of  30  percent  or 
less  is  undesirable  in  considering  the  amount  of  pay  zone  in 
a  field.   The  initial  core  analysis  taken  from  Field  2  will 
be  used  for  example  calculations. 

There were 36 samples from well 1, 6 of which were ≤ 30
percent. Thus:

$\bar{p} = \dfrac{6}{36} = \dfrac{1}{6} = .167$

and,

$S_{\bar{p}} = \sqrt{\dfrac{(.167)(.833)}{35}} = .066$

And for a confidence coefficient of .95, Z is 1.96,⁸

$\bar{p} \pm Z\,S_{\bar{p}}$

.167 ± 1.96 (.066)
.167 ± .129

Thus  it  may  be  concluded  with  a  confidence  coefficient 
of  95  percent  that  the  percentage  of  the  field  pay  zone  that 
is  undesirable  according  to  the  predetermined  criteria  is 
somewhere  between  3.8  percent  and  29.6  percent. 
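
Equations (6-5) and (6-6) can be evaluated directly, as in the Python sketch below. The function name is hypothetical; the counts (6 of 36 samples) and Z = 1.96 are from the text. Evaluated without intermediate rounding, the standard deviation comes out near .063, slightly below the .066 used above, so the computed limits differ from the quoted 3.8 and 29.6 percent in the second decimal place.

```python
import math


def proportion_interval(count, n, z=1.96):
    """Return (p, s_p, lower, upper) per equations (6-5) and (6-6)."""
    p = count / n
    s_p = math.sqrt(p * (1 - p) / (n - 1))
    return p, s_p, p - z * s_p, p + z * s_p


# Well 1: 6 of 36 samples had oil saturation of 30 percent or less.
p, s_p, lower, upper = proportion_interval(6, 36)
```

The interval remains wide in either case, which reflects how little a single well of 36 samples can say about the proportion for the whole field.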

Another way to interpret the use of the population
proportion and its relation to the confidence coefficient is
illustrated by the following modification of the above
problem. Instead of establishing a confidence coefficient
of 0.95 and determining the range in the percent of formation
which would be unproductive, suppose that one seeks to know
the confidence coefficient associated with a particular
percentage range such as 1/6 ± 3 percent, or .167 ± .03.
One then determines the value of Z corresponding to the ±
3 percent range. In this case:

interval in population proportion = $\bar{p} \pm Z\,S_{\bar{p}}$

A value of Z is sought such that $Z(S_{\bar{p}})$ = .03 or Z =
.03/.066 = .455. This value of Z corresponds to a confidence
coefficient of .35. The interpretation is that the true
unproductive fraction of the formation has 2 chances out of
3 of being outside the range .167 ± .03, or 1 chance
in 3 (approximately 35 percent) of lying within the
proportion range .167 ± .03.
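
The conversion from a normal deviate Z to a confidence coefficient, which the text reads from a table, can also be computed from the error function. The sketch below is an illustration only; the function name is hypothetical, and the identity used is the standard one for a two-sided normal interval, P(|z| <= Z) = erf(Z / sqrt(2)).

```python
import math


def confidence_coefficient(z):
    """Two-sided normal probability P(-z <= Z <= z)."""
    return math.erf(z / math.sqrt(2))


# Z for a half-width of .03 with the standard deviation .066 used in the text.
z = 0.03 / 0.066
coefficient = confidence_coefficient(z)

# Check against the familiar value: Z = 1.96 corresponds to about .95.
check_95 = confidence_coefficient(1.96)
```

The same function run in reverse (by trial or by table) gives the Z needed for any desired confidence coefficient.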

Summary 

Two rather elementary techniques of estimating information
about a field when the amount of core analysis data is
limited, in this case one well, have been presented. For
Field 2, the sample mean of well 1 gave good results as an
estimation of the population mean for the areal population
from well S2 to N6.

The  utility  of  these  sampling  techniques  can  only  be 
confirmed  after  being  tried  and  tested  with  a  large  number 
of  fields  where  the  actual  results  could  be  compared  with 
the  estimated  predictions.   It  would  be  conceivable  to  assume 
that  over  the  long  run,  sampling  techniques  would  provide  a 
better  estimate  of  what  is  to  be  expected  than  those  estimates 
determined  intuitively.   Granted,  in  some  cases  the  reservoir 
predictions  by  sampling  techniques  would  prove  to  be  in  error, 
and  yet  the  estimates  could  show  less  error  than  the  error 
or  difference  found  by  intuitive  predictions. 

Sampling results should be recognized as only estimates
and not as infallible predictions. Yet the sampling
estimates should prove better than no estimates at all.
For any one reservoir the estimates are either right or
wrong. The .95 confidence coefficient means that if the
procedure is followed for a long enough time over a sufficiently
large number of reservoirs, the results should be correct
95 percent of the time.

It  was  the  purpose  of  this  discussion  on  sampling  to 
indicate  how  certain  techniques  could  be  applied  and  possibly 
to  stimulate  the  study  of  the  applicability  of  other  more 
advanced  or  specific  sampling  techniques.   It  is  felt  that 
the  work  and  investigation  in  this  area  has  just  begun  with 
unlimited  possibilities  yet  to  be  explored. 


CHAPTER  VII 
SUMMARY  AND  CONCLUSIONS 


Introduction 


At  the  outset  of  this  study  two  alternate  courses  of 
investigation  were  considered  for  the  statistical  analysis 
of  the  field  data  available;  the  first,  to  concentrate  on 
one  statistical  technique  and  analyze  it  in  detail,  and 
second,  to  explore  various  techniques  with  less  detailed 
analysis  of  any  one  point.   The  latter  course  was  chosen 
with three very broad areas included in the study, i.e.,
generalized or skewed distribution curves including
transformation of data to logarithms for analysis, analysis of
variances, and sampling procedures.

The  choice  of  this  latter  course  was  prompted  by  the 
realization  that  relatively  little  use  of  theoretical 
statistics  has  been  made  in  core  analysis  as  judged  by 
available  discussion  in  the  petroleum  literature.   An 
attempt  was  made  to  present  in  one  volume  the  necessary 
calculation  procedures  and  interpretations  to  serve  as  a 
broad  guide  for  a  statistical  analysis  of  field  data. 
Additionally,  it  is  hoped  that  the  study  will  serve  as  a 
basis  for  further  investigation  into  the  applicability  of 
the  science  of  statistics  to  reservoir  problems. 


Summary 


This  thesis  has  attempted  to  indicate  the  applicability 
of  certain  statistical  procedures  to  the  analysis  of  core 
samples. Statistical techniques were employed to segregate
relevant and meaningful information from a mass of raw data.

A  method  of  determining  and  describing  frequency  distri- 
butions of  the  physical  properties  of  a  reservoir  is  presented 
in  Chapter  III.   The  Pearson  family  of  generalized  frequency 
curves  provides  an  effective  means  of  classifying  observed 
frequency  distributions  without  being  restricted  to  a  few 
classical  distributions  such  as  the  normal  curve,  the  Poisson 
curve  or  the  binomial  distribution. 

Chapter IV is devoted to the study of the effects of
the transformation of raw data to logarithms for analysis.
The results obtained were compared with the results of Jan
Law's work.

Analysis  of  variances  is  studied  in  Chapter  V  with 
various  testing  procedures  shown.   The  F  test  and  Bartlett's 
test  are  employed  to  test  for  uniformity  of  the  population 
represented  by  the  various  well  samples.   The  modified 
Tukey  and  Sequential  tests  are  used  to  determine  what  sub- 
grouping  of  the  wells  are  statistically  homogeneous. 

Sampling procedures are presented in Chapter VI.
Sample means are analyzed and presented on control charts
by techniques analogous to production line sampling
techniques. Population proportions are presented with confidence
intervals.

The following listing presents a summary of the main
equations associated with the above statistical procedures.

For Frequency Distributions
Pearson  Type  I  Curve 


(2-24) 


Type  IV  Curve 


m-.       m, 
Y  =  Cft-i^)  1  (r2-t)  ' 


IT 


— 1  /t+->?\     p 

(2-51)     Y  =  C[(t+r)2  +  S2]~m  e-v  tan   <— >  ev 


Type  VI  Curve 
(2-3^) 

Normal  Curve 
(2-40) 

Type  III  Curve 
(2-43) 


m0      m, 
Y  =  C(Z  d)(Z-a)  ± 


Y  =  C  e 


t; 

2 


Y  =  Cx(A+t) 


A  -1   -At 
e 


The  F  Test 
(5-10) 

where: 

(5-9) 

and, 
(5-3) 


For  Analysis  of  Variance 


F  = 


o 


Se 


* .  ■ 

72 


S  2  =  n  S2 
p     o   m,p 


n- 


Se  = 


(nx  + n^.)  -  n± 


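Modified Tukey and Sequential Testing

(5-11)  $S_{\bar{X}} = \dfrac{S_X}{\sqrt{n}}$

(5-12)  $D = Q\,S_{\bar{X}}$


Bartlett's Test

(5-13)  $B = \dfrac{1}{C}\left(v \ln S^2 - \sum v_i \ln S_i^2\right)$

where:

(5-14)  $C = 1 + \dfrac{\sum \dfrac{1}{v_i} - \dfrac{1}{v}}{3(K-1)}$


For Sampling

(6-2)  $S_X = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n-1}}$

(6-3)  $\sigma_{\bar{X}} = \dfrac{\sigma_X}{\sqrt{n}}$

(6-5)  $S_{\bar{p}} = \sqrt{\dfrac{\bar{p}(1-\bar{p})}{n-1}}$

(6-6)  $\bar{p} \pm Z\,S_{\bar{p}}$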

Conclusions

In conjunction with the comments given in the summary
sections of the specific chapters, the major conclusions to
be derived from the data studied are believed to be:

1.  Generalized frequency curves are useful in reflecting the
variation of individual properties throughout a single
depositional unit, and in characterizing and distinguishing
the unit from other units in the same geological
basin. The terms $\alpha_3$ and $\delta$ should provide additional
relevant numerical parameters for mathematical model
analysis of a reservoir.

2.  Logarithmic transformation is occasionally useful to
remove some skewness from permeability data, but does
not necessarily convert the data to the normal distribution
under all circumstances.

3.  Analysis  of  variances  tests  provide  a  useful  tool  to 
investigate  homogeneity  between  well  properties  within  a 
reservoir  or  within  sub  areas  of  a  reservoir. 

4.  Sampling  techniques  may  be  utilized  to  estimate  unknown 
population  quantities  from  a  knowledge  of  relatively 
small  sample  quantities. 

5.  The  science  of  theoretical  statistics  applied  to  reservoir 
analysis  is  in  its  infancy  and  appears  to  offer  many 
opportunities  for  the  solution  of  reservoir  problems. 

Recommendations for Further Investigation

While working with the Pearson frequency curves, certain
areas of further investigation appeared to offer promise:

1.  An investigation into the relationship between reservoir
performance and the type of frequency distribution
followed by its physical properties. For example, does
a field with a Pearson Type III S distribution perform
differently than a field with a Type I, other factors
being the same? What is the correlation, if any,
between the parameters ($\alpha_3$, $\delta$) and field performance,
the parameters $\alpha_3$ and $\delta$ being considered for any physical
property or group of properties?

2.  Sampling techniques are available for populations that
approximate the normal frequency distribution. Could
reliable techniques be developed for populations that
satisfy certain types of the Pearson family of frequency
distributions? A table of Areas Under the Type III
Curve² has been developed and could possibly serve as a
point of departure for investigating sample techniques
for core samples.

Numerous  tests  have  been  devised  for  the  analysis  of 
variances.   Certain  of  these  tests  were  presented  in  this 
thesis.   Other  tests  may  prove  more  useful  than  the  ones 
presented.   Further  investigation  into  this  area  of  statisti- 
cal analysis  is  highly  recommended. 

APPENDIX A

MATHEMATICAL, FORTRAN, AND TECHNICAL NOTATIONS AND TERMS


Mathematical  Fortran
Notation      Notation  Description

              A         A parameter of the Pearson Type III Curve, equal to
                        2/ALP3
              ASQ       Square of A
                        Parameter of the differential equation (2-1); also
                        skewness of the distribution of a set of data defined
                        by equation (2-13); also, a term used in the
                        Sequential method of testing to describe the range of
                        means in an array
              AJ        Number of measurements in a particular class interval
              AK        Total number of measurements
α_n                     The nth α term, where α_n = (nth moment about the
                        mean)/σⁿ, see equation (2-7)
α             ALPHA     A parameter of the Type VI Pearson Curve, see
                        equation (2-37)
α₃            ALP3      A basic parameter of the Pearson System of Frequency
                        Curves, see equation (2-7A)
α₃²           ALP3S     Square of ALP3
α₄            ALP4      A basic parameter of the Pearson System of Frequency
                        Curves, see equation (2-7A)
              AN        Number of class intervals
              ARG       Intermediate variable for computing YBAR of the type
                        curve
              ARG1      Intermediate variable for computing YBAR of the type
                        curve
B                       A statistic calculated in Bartlett's test for variance
                        homogeneity, see equation (5-15)
P                       An intermediate value for derivation of the constant
                        for Type I curve, see equation (2-28)
b₀, b₁, b₂              Terms of equation (2-2), where f(t) is expanded in a
                        converging power series; values of the terms are
                        defined in equation (2-11)
              BELSQ     Summed BSQ between observed data and Normal Curve
              BSQ       Equivalent to an individual (f-F)² term on page
                        [illegible] for Normal Curve
C             C         A constant in the frequency curve for a particular
                        set of data; also a statistic calculated in
                        Bartlett's test, see equation (5-14)
              CHI       Equivalent to an individual (f-F)² term on page
                        [illegible] for Pearson Curve
              CHISQ     Summed CHI between observed data and Pearson Curve;
                        also a statistical distribution used for certain
                        statistical testing
              CON       Storage locations for all parameters and the constant
                        characteristic of a given Pearson Type Curve:

                                    Pearson Type
                          CON(I)    I, VI    III    IV
                          (1)       EM1      A      R
                          (2)       EM2      C      EM
                          (3)       R1              S
                          (4)       R2              V
                          (5)       C               C

              CST       The Gaussian coefficients (16) for integration in
                        determining F(R,V), see Ref. [illegible], Chapter 3
                        for source of coefficients
D             D         An intermediate parameter for all main Pearson Type
                        Curves, see equation (2-15)
D                       For analysis of variance, a computed difference
                        between means that is significant at a chosen level,
                        see equation (5-12)
              DAVG      Arithmetic mean or average of data, see equation
                        (6-11)
              DD(I)     Midpoint value for class interval I
              DECM      Difference between the highest value permissible in
                        a given interval and the lowest permissible value in
                        the next higher interval
              DELD      Difference between maximum and minimum value of a
                        class interval
δ             DELTA     A basic parameter of the Pearson System of Frequency
                        Curves, see equation (2-12)
              DMAX      Maximum value for data in highest class interval
              DMIN      Minimum value for data in lowest class interval
              DMP       Term used to locate interval which contains the mean
√D            DSQRT     The square root of D, see equation (2-15)
m             EM        A parameter in the Type IV Curve, see equation (2-50)
m₁            EM1       A parameter in Type I and VI Curves, see equation
                        (2-25)
m₂            EM2       A parameter in Type I and VI Curves, see equation
                        (2-25)
              EU        The independent normalized variable for the interval
                        -1/2 to +1/2 based upon 16 subdivisions for Gaussian
                        integration.  The 16 values of EU are taken from
                        Ref. [illegible], Chapter 2
              EZ        The term (sin θ)^r e^(vθ) used to determine F(R,V),
                        see equation (2-33)
F                       A ratio devised by Fisher to test for homogeneity of
                        means, see equation (5-10)
f_e                     Degrees of freedom associated with S_e², see
                        equation (5-4)
              FRV       F(R,V) function, used for constant of Type IV curve,
                        see Ref. [illegible], Chapter 2
G                       A function used in the derivation of the constant
                        for Type IV curve, see equation (2-33)
f             G(J)      Frequency of values in an interval
i             I         Index, usually denoting interval number
i                       Term to indicate a complex root in derivation of
                        Type IV curve, see equation (2-30)
              ID        Identification number for the field, well, and lease
              IDEPT     Depth of sample, feet
              INDEX     An index to distinguish permeability data cards from
                        other data cards:
                          INDEX = 1 for permeability
                          INDEX = 2 for other data
              II        An index to distinguish the first cycle of a loop
                        from all subsequent cycles
              IPERM     Permeability, millidarcys
φ             IPHI      Porosity, fractional
              ISAMP     Sample number
S_o           ISO       Oil saturation, fractional
∞                       Infinity
              J         Index denoting number of different Pearson Type
                        Curve fits desired
k                       Number of variances considered in Bartlett's test
              L         Identification Program: numerical equivalent of AN
                        in fixed point notation
              L         Fitting Program: index used in integration for YAVG
              MP        Number of the interval, counting from the lowest,
                        which contains the mean
              NA        Number of intervals in fixed point notation
              NDEX      An index to distinguish among porosity, oil
                        saturation and water saturation data cards:
                          NDEX = 1 for porosity
                          NDEX = 2 for oil saturation
                          NDEX = 3 for water saturation
n                       An estimate of an average sample size, see equation
                        (5-8)
              NTYPE     Type of Pearson Curve to be calculated:
                          NTYPE    Pearson Type
                          1        I
                          2        III
                          3        IV
                          4        VI
              NUMBR     Number of different types of curves to be calculated
                        in a computer run
              P1        An intermediate variable for the Type IV curve
p̄                       A sample proportion, see equations (6-5) and (6-6)
q             PROP      Variable for an individual physical property such as
                        porosity, permeability, etc.
π                       An interval scale for the logarithms of permeability
                        data
Q                       A factor obtained from a table to compute D, see
                        equation (5-12)
q̄                       Mean value of q, see equation (3-5)
Δq                      The interval of q used to construct the histogram on
                        the q scale
r             R         A parameter of the Pearson Type IV Curve, see
                        equation (2-30)
r₁            R1        A parameter of the Pearson Type I and VI Curves, see
                        equation (2-15)
r₂            R2        A parameter of the Pearson Type I and VI Curves, see
                        equation (2-15)
s             S         A parameter of the Pearson Type IV Curve, see
                        equation (2-30)
σ             SIGMA     The standard deviation of the set of data, see
                        equation (3-9)
S²                      A term in Bartlett's test equal to Σ ν_i S_i² / ν
S_e²                    Within sample or error variance, see equation (5-3)
S_i²                    Sample variance in Bartlett's test, see equations
                        (5-13) and (5-14)
S²_m,p                  Variance of the population of means, see equation
                        (5-5)
S_p²                    An estimate of the population variance, see
                        equations (5-7) and (5-9)
S_p̄                     Standard deviation of the probability distribution
                        of the sample proportion, see equation (6-5)
S_x                     The sample standard deviation, see equation (6-2)
S                       A sample variance, see equation (5-2)
S_x̄                     The standard deviation of the probability
                        distribution of the sample means, see equation (5-11)
σ_x̄                     The standard deviation of the probability
                        distribution of the sample mean, see equation (6-3)
              SUM       An intermediate storage cell used for the Gaussian
                        integration in determining F(R,V)
              SUMD      Sum of data values in an interval
              SUMXG(I)  Sum for the I'th moment about the midpoint of the
                        interval containing the mean, see Table IV
t             T         Independent variable for frequency distributions, in
                        standardized notation, see equation (3-4)
              TANM1     Intermediate variable for computing YBAR of the Type
                        Curve
              TEMP1     A temporary storage cell for an intermediate value
                        in the calculation of the Type IV curve
              TEST      Term for determining whether a data value is within
                        overall interval limits
θ             THETA     The angle equal to π(EU) + π/2 for use in the F(R,V)
                        calculation
              TOTAL     Total number of property samples for a field
              TR        Intermediate variable for computing YBAR of the Type
                        Curve
μ₂            U2        2nd moment about the mean, see equation (3-3)
μ₃            U3        3rd moment about the mean, see equation (3-3)
μ₄            U4        4th moment about the mean, see equation (3-3)
μ_r                     The rth moment about the mean, see equation (3-2)
μ₁′           UP1       First moment of the histogram about the arbitrarily
                        chosen midpoint, see equation (3-0)
μ₂′                     Second moment of the histogram about the arbitrarily
                        chosen midpoint
μ₃′                     Third moment of the histogram about the arbitrarily
                        chosen midpoint
μ_I′          UP(I)     I'th moment of the data about the midpoint of the
                        interval which contains the mean, see equation (3-1)
                        The mode of a given frequency distribution
              UMOD2     2nd moment about the mean incorporating Sheppard's
                        corrections, see equation (3-10)
              UMOD3     3rd moment about the mean incorporating Sheppard's
                        corrections, see equation (3-11)
              UMOD4     4th moment about the mean incorporating Sheppard's
                        corrections, see equation (3-12)
v             V         A parameter of the Pearson Type IV Curve, see
                        equation (2-30)
ν                       A term in Bartlett's test (defining expression
                        illegible)
ν_i                     Degrees of freedom associated with each sample
                        variance in Bartlett's test, see equation (5-13)
w                       A parameter in the derivation of the constant for
                        Type I Curve, see equation (2-26)
x             XBAR      The independent variable for the frequency
                        distribution wherein each interval has unit width
              XMP       Equivalent of MP but in floating point notation
X̄_p²                    A term used to compute S²_m,p, see equation (5-6)
X̄_φ                     The mean of a set of porosity data
X̄_so                    The mean of a set of oil saturation data
X̄_sw                    The mean of a set of water saturation data
              XX        The values of the independent variable for a
                        frequency distribution corresponding to the left,
                        middle, and right ends of an interval.  These
                        intervals are of unit width
              XZERO     The value of XX at the left extremity of a class
                        interval
Y             Y         The frequency for a Pearson type curve at any value
                        of the independent variable, t
              YAVG      Integrated average value of frequency over a
                        specific interval, see equation (3-15)
              YNORM     A point value of frequency on the normal curve
Z             Z         A parameter for the Pearson Type VI curve, see
                        equation (2-26)
z                       The normal deviate corresponding to a desired
                        confidence coefficient, see equation (6-6)

              ZERO      A numerical constant, equal to zero


Term                Notation  Definition

Array                         An arrangement of raw numerical data in
                              ascending or descending order of magnitude
Chi-Square          χ²        A measure of the discrepancy existing between
                              observed and expected frequencies
Class Frequency               Number of individuals belonging to each class
                              in summarizing large masses of raw data
Degrees of Freedom  d.f.,     The number N of independent observations in the
                    f_e, ν    sample (i.e., the sample size) minus the number
                              K of population parameters which must be
                              estimated from sample observations
Gamma Function      Γ(X)      A transcendental function used frequently in
                              statistics and defined by equation (5-12)
Frequency                     A tabular arrangement of data by classes
Distribution                  together with the corresponding class
                              frequencies
Histogram                     A set of rectangles having: (a) bases on a
                              horizontal axis (X axis) with centers at the
                              class marks and lengths equal to the class
                              interval sizes, (b) areas proportional to class
                              frequencies
Mean                X̄         Arithmetic average of a set of data
Median                        The middle value of a set of numbers arranged
                              in order of magnitude
Mode                          That value of a set of numbers which occurs
                              with the greatest frequency
Moments             μ         A term used in the measurement of dispersion of
                              a distribution and defined by equation (3-2)
Normal                        Also referred to as Gaussian curve, normal
Distribution                  curve of error, or normal probability
                              distribution.  A bell-shaped distribution
                              defined by equation (2-40)
Oil Saturation                A measure of the oil present within a rock
                              expressed as a percentage of the total fluid
                              saturation within a rock
Permeability        K         A property of a porous medium and a measure of
                              the capacity of the medium to transmit fluids.
                              Usually measured in darcies or millidarcies
Population                    Term used in statistics to refer to the
                              hypothetical complete enumeration of facts in a
                              particular field of study
Porosity            φ         The ratio of the void space in a rock to the
                              bulk volume of that rock multiplied by 100 to
                              express it in percent
Range                         The difference between the largest and smallest
                              numbers in a set of numbers
Skewness                      The degree of asymmetry, or departure from
                              symmetry, of a distribution, see equation (2-13)
Standard Deviation  σ         The root mean square of the deviations from the
                              mean
Standardized Unit   t         A term to express distance from mean to
                              midpoint of histogram intervals in units of
                              standard deviations, see equation (3-4)
Standardized Unit   z         A term to express the distance between the mean
                              and another specified value of a frequency
                              distribution as units of standard deviations
Unimodal                      For frequency curves, those curves that have
                              only one mode
Variance            σ²        The square of the standard deviation
Water Saturation    S_w       A measure of the water present within
                              a rock expressed as a percentage of the
                              total fluid saturation within a rock


FORTRAN PROGRAMS AND FLOW CHARTS

C  0000  0  650 FORTRAN PROGRAM TO
C  0000  0  CALCULATE A GOOD THEORETICAL
C  0000  0  FIT OF A PEARSON TYPE
C  0000  0  FREQUENCY CURVE FOR A GIVEN
C  0000  0  OBSERVED DISTRIBUTION.
C  0000  0  THE PROCEDURE TO BE FOLLOWED
C  0000  0  1 TABULATE THE DATA USING
C  0000  0  CONVENIENT CLASS INTERVALS.
C  0000  0  2 CALCULATE THE MOMENTS ABOUT
C  0000  0  A SELECTED VERTICAL.
C  0000  0  3 TRANSFER THE MOMENTS TO
C  0000  0  THE MEAN.
C  0000  0  4 APPLY SHEPPARDS CORRECTIONS
C  0000  0  TO THE MOMENTS.
C  0000  0  5 CALCULATE ALP3S ALP4 DELTA
C  0000  0  6 LOCATE THE MEAN (DAVG)
C  0000  0  7 DETERMINE FROM THE CHART
C  0000  0  WHAT TYPE OF CURVE TO USE
C  0000  0  PROGRAM BY J S VAN SCOYOC
C  0000  0  UNDER THE DIRECTION OF DR.
C  0000  0  FLOYD PRESTON
0  0000  0  DIMENSION DD(100),G(100),UP(8)
0  0000  1  ,SUMXG(8),X(100),XG(8)
0  0001  0  READ1, INDEX,NDEX
0  0030  0  READ1, DMIN, DMAX, DELD,DECM
C  0000  0  DETERMINE NUMBER OF INTERVALS
0  0040  0  AN =(DMAX-DMIN) /(DELD+DECM)
0  0041  0  IF(XCONF(1)) 42,50,50
0  0042  0  PUNCH1, AN
0  0050  0  L = XFIXF (AN)
C  0000  0  DETERMINE CENTER OF INTERVALS
0  0060  0  DD(1) = DMIN +DELD/2.0
0  0061  0  IF(XCONF(1))62,70,70
0  0062  0  PUNCH1, DD(1)
0  0070  0  DO 80 I = 1, L
0  0080  0  DD(I+1) =DD(I) + DELD+1.
0  0081  0  IF(XCONF(1)) 82,90,90
0  0082  0  PUNCH1, DD
C  0000  0  COUNT D,S IN EACH INTERVAL
C  0000  0  AND COUNT TOTAL D,S AND SUM D,
0  0090  0  AK = 0.
0  0100  0  SUMD=0.
0  0121  0  II = 1
0  0122  0  J = 1
0  0130  0  AJ = 0.
0  0140  0  GO TO (150,180),INDEX
C  0000  0  READ IN DATA
0  0150  0  READ 1, ID, ISAMP, IDEPT,
0  0150  1  IPERM
C  0000  0  CONVERT TO FLOATING POINT
0  0160  0  DATA = FLOTF (IPERM)
0  0170  0  GO TO 270
C  0000  0  READ IN DATA
0  0180  0  READ1, ID, ISAMP, IDEPT, IPHI,
0  0180  1  ISO,ISW
0  0190  0  GO TO(200,220,240),NDEX
C  0000  0  CONVERT TO FLOATING POINT
0  0200  0  DATA = FLOTF (IPHI)
0  0210  0  GO TO 270
0  0220  0  DATA = FLOTF (ISO)
0  0230  0  GO TO 270
0  0240  0  DATA = FLOTF (ISW)
0  0270  0  IF(DATA-DMIN)140,280,280
0  0280  0  GO TO(290,330),II
0  0290  0  II = 2
0  0300  0  TEST = DMIN + DELD+DECM
0  0330  0  IF (DATA-TEST) 340,380,380
0  0340  0  SUMD = SUMD + DATA
0  0360  0  AJ = AJ + 1.
0  0361  0  IF(XCONF(1))362,370,370
0  0362  0  PUNCH1,SUMD,DATA,TEST,AJ,AK,J,
0  0362  1  I
0  0370  0  GO TO 140
0  0380  0  AK = AK + AJ
0  0390  0  G(J) = AJ
0  0395  0  AJ=0.
0  0400  0  J = J + 1
0  0410  0  TEST = TEST + DELD+1.
0  0420  0  IF (TEST -(DMAX + DECM)) 330,330,
0  0420  1  430
C  0000  0  CALCULATE MEAN
0  0430  0  DAVG = SUMD/ AK
0  0431  0  IF(XCONF(1))432,440,440
0  0432  0  PUNCH1,SUMD,AK,DAVG,G
C  0000  0  DETERMINE INTERVAL WHICH
C  0000  0  CONTAINS MEAN
0  0440  0  DMP = (DAVG-DMIN) /(DELD+DECM)
0  0450  0  MP = XFIXF (DMP) + 1
0  0460  0  XMP = FLOTF (MP)
0  0461  0  IF(XCONF(1))462,480,480
0  0462  0  PUNCH1, DMP,MP,XMP
C  0000  0  PLACE INTERVALS ON UNIT BASIS
0  0480  0  DO 490 I=1,8
0  0490  0  SUMXG(I) = 0.
0  0495  0  NA=L+1
0  0500  0  DO 560 I = 1,NA
0  0510  0  X(I) = FLOTF (I)-XMP
0  0511  0  IF(XCONF(1))512,520,520
0  0512  0  PUNCH1,X(I)
C  0000  0  CALCULATE XG TO X8G
0  0520  0  XG(1)=X(I)*G(I)
0  0530  0  DO 540 J = 1,7
0  0540  0  XG(J+1) = XG(J) * X(I)
0  0541  0  IF(XCONF(1))542,550,550
0  0542  0  PUNCH1, XG
0  0550  0  DO 560 J = 1, 8
C  0000  0  SUM XG TO X8G
0  0560  0  SUMXG(J) = SUMXG(J) + XG(J)
0  0561  0  IF(XCONF(1)) 562,570,570
0  0562  0  PUNCH1, SUMXG
C  0000  0  CALCULATE UPRIMES
0  0570  0  DO 580 I=1,8
0  0580  0  UP(I) = SUMXG (I) / AK
0  0581  0  IF(XCONF(1)) 582,590,590
0  0582  0  PUNCH1, UP
C  0000  0  CALCULATE U2 TO U8
0  0590  0  U2=UP(2)-UP(1)*UP(1)
0  0600  0  U3=UP(3)-3.*UP(1)*UP(2)+2.*
0  0600  1  UP(1)*UP(1)*UP(1)
0  0610  0  U4=UP(4)-4.*UP(1)*UP(3)+6.*
0  0610  1  UP(1)*UP(1)*UP(2)-3.*UP(1)*
0  0610  2  UP(1)*UP(1)*UP(1)
0  0620  0  SIGMA=U2**.5
0  0621  0  IF(XCONF(1)) 622,630,630
0  0622  0  PUNCH1,U1,U2,U3,U4
C  0000  0  APPLY SHEPPARDS CORRECTIONS
0  0630  0  UMOD2=U2-1./12.
0  0640  0  UMOD3=U3
0  0650  0  UMOD4=U4-0.5*U2+7./240.
0  0651  0  IF(XCONF(1)) 652,660,660
0  0652  0  PUNCH1,UMOD2,UMOD3,UMOD4
C  0000  0  CALCULATE VALUES NEEDED
0  0660  0  ALP3S=UMOD3*UMOD3/(UMOD2*UMOD2
0  0660  1  *UMOD2)
0  0670  0  ALP3=ALP3S**.5
0  0680  0  ALP4=UMOD4/(UMOD2*UMOD2)
0  0690  0  DELTA=(2.*ALP4-3.*ALP3S-6.)/
0  0690  1  (ALP4+3.)
0  0760  0  DELD=DELD+DECM
0  0770  0  PUNCH1,NA,N,DMIN,DMAX,DELD,
0  0770  1  DAVG
0  0780  0  DO790 I=1,NA
0  0785  0  PUNCH1,I,DD(I),X(I),G(I)
0  0790  0  CONTINUE
0  0795  0  PUNCH1,ALP3,ALP3S,DELTA,SIGMA,
0  0795  1  UP(1)
0  0810  0  GO TO 30
0  0820  0  END


FLOW CHART TO IDENTIFY THE PEARSON TYPE CURVE
CORRESPONDING TO A GIVEN DATA SET

(Flow chart not reproduced.  It traces the identification program:
read INDEX, NDEX, DMIN, DMAX, DELD, DECM; compute AN, L, and the
interval midpoints DD(I); read and count the data into G(J) while
accumulating SUMD and AK; compute DAVG, DMP, MP, XMP; form X(I),
XG(J), SUMXG(I), and UP(I); compute SIGMA, U2, U3, U4, UMOD2, UMOD3,
UMOD4, ALP3, ALP4, DELTA; punch NA, N, DMIN, DMAX, DELD, DAVG, then
I, DD(I), X(I), G(I) for each interval, then ALP3, ALP3S, DELTA,
SIGMA, UP(1).)

FIGURE  31 
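
The moment calculations of the identification program (statements 0480 through 0690) can be sketched in a modern language. The following Python is only an illustration under stated assumptions: the function name `pearson_criteria` and its dictionary keys are hypothetical, the class intervals are taken to have unit width as in the listing, and only the first four moments are formed. It is not a transcription of the 650 program.

```python
def pearson_criteria(freqs, mean_interval):
    """Sketch of the moment steps in the identification program.

    freqs[i] is the count G(I) in class interval I = i + 1 (unit width);
    mean_interval is MP, the 1-based interval that contains the mean.
    Both names are illustrative, not taken from the thesis.
    """
    total = float(sum(freqs))                              # AK
    # X(I) = I - XMP: position relative to the interval holding the mean
    x = [(i + 1) - mean_interval for i in range(len(freqs))]
    # UP(k): k-th moment about the chosen origin, k = 1..4
    up1, up2, up3, up4 = [
        sum(g * xi ** k for g, xi in zip(freqs, x)) / total
        for k in range(1, 5)
    ]
    # Transfer the moments to the mean (U2, U3, U4 in the listing)
    u2 = up2 - up1 ** 2
    u3 = up3 - 3.0 * up1 * up2 + 2.0 * up1 ** 3
    u4 = up4 - 4.0 * up1 * up3 + 6.0 * up1 ** 2 * up2 - 3.0 * up1 ** 4
    # Sheppard's corrections for unit class width (UMOD2, UMOD3, UMOD4)
    umod2 = u2 - 1.0 / 12.0
    umod3 = u3
    umod4 = u4 - 0.5 * u2 + 7.0 / 240.0
    # Pearson criteria; ALP3 is taken as the positive root, as in the listing
    alp3s = umod3 ** 2 / umod2 ** 3
    alp4 = umod4 / umod2 ** 2
    delta = (2.0 * alp4 - 3.0 * alp3s - 6.0) / (alp4 + 3.0)
    return {"sigma": u2 ** 0.5, "alp3": alp3s ** 0.5,
            "alp3s": alp3s, "alp4": alp4, "delta": delta}


if __name__ == "__main__":
    # A small symmetric histogram: skewness is zero by construction
    print(pearson_criteria([1, 4, 6, 4, 1], mean_interval=3))
```

As in the listing, SIGMA is computed from the uncorrected second moment, while ALP3S, ALP4, and DELTA use the Sheppard-corrected moments.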
C  0000  0  PEARSON FIT FOR TYPES 1,3,4,6

C  0000  0  DATA FOR READ2,READ3, AND
C  0000  0  READ4, ARE SUPPLIED BY THE
C  0000  0  IDENTIFICATION
C  0000  0  NUMBER (FOR READ5,)=NUMBER
C  0000  0  OF DIFFERENT TYPES OF CURVES
C  0000  0  WHICH ARE TO BE FITTED TO THE
C  0000  0  SAME DATA
C  0000  0  NTYPE=CURVE TYPE NUMBER
C  0000  0  NTYPE=1 FOR PEARSON TYPE 1
C  0000  0  NTYPE=2 FOR PEARSON TYPE 3
C  0000  0  NTYPE=3 FOR PEARSON TYPE 4
C  0000  0  NTYPE=4 FOR PEARSON TYPE 6
C  0000  0
C  0000  0  THIS PROGRAM NEEDS SINF AND
C  0000  0  XCONF SUBROUTINES AND SPECIAL
C  0000  0  GAMMA FUNCTION ROUTINE
C  0000  0
C  0000  0
0  0000  0  DIMENSION DD(50),X(50),G(50),
0  0000  1  XX(3),Y(3),YNORM(3),CON(5),
0  0000  2  CST(16),EU(16),EZ(16),
0  0000  3  THETA(16)
0  0001  0  ZERO=0.
0  0000  0  READ1,CST,EU
0  0000  0  DO 501 I=1,16
0  0000  0  THETA(I)=3.1415927*EU(I)+
0  0000  1  1.5707963
0  0501  0  EZ(I)=SINF(THETA(I))
0  1011  0  READ2,NA,TOTAL,DMIN,DMAX,DELD,
0  1011  1  DAVG
0  0000  0  AN=FLOTF(NA)
0  0003  0  DO4I=1,NA
0  0004  0  READ3,J,DD(I),X(I),G(I)
0  0005  0  READ4,ALP3,ALP3S,DELTA,SIGMA,
0  0005  1  UP1
0  0006  0  D=ALP3*ALP3-4.*DELTA*
0  0006  1  (DELTA+2.)
0  0007  0  IF(D)8,9,9
0  0008  0  D=-D
0  0009  0  DSQRT=D**.5
0  0010  0  READ5,NUMBR
0  0011  0  J=1
0  0012  0  READ6,NTYPE
0  0013  0  DO14I=1,5
0  0014  0  CON(I)=0.
0  0015  0  PUNCH1,NTYPE
0  0000  0  NPCH=417
0  0000  0  IF(XCONF(1))417,17,17
0  0417  0  PUNCH1,NPCH,NTYPE,NUMBR,NA,D,
0  0417  1  DSQRT,AN
0  0017  0  GO TO(18,30,36,18),NTYPE
0  0018  0  EM1=-((1.+DELTA)/DELTA)*(1.-
0  0018  1  ALP3/DSQRT)-1.
0  0019  0  CON(1)=EM1
0  0020  0  EM2=-((1.+DELTA)/DELTA)*(1.+
0  0020  1  ALP3/DSQRT)-1.
0  0021  0  CON(2)=EM2
0  0022  0  R1=(-ALP3+DSQRT)/(2.*DELTA)
0  0023  0  CON(3)=R1
0  0024  0  R2=(-ALP3-DSQRT)/(2.*DELTA)
0  0025  0  CON(4)=R2
0  0026  0  GO TO(27,29,29,48),NTYPE
0  0027  0  C=TOTAL/(SIGMA*(R2-R1)**(EM1+
0  0027  1  EM2+1.)*GAMAF(EM1+1.)*GAMAF(EM
0  0027  2  2+1.))
0  0127  0  CON(5)=C
0  0028  0  GO TO 51
0  0029  0  STOP
0  0030  0  A=2./ALP3
0  0131  0  CON(1)=A
0  0031  0  ASQ=A*A
0  0032  0  C=A**(ASQ)/(SIGMA*EXPEF(ASQ)*
0  0032  1  GAMAF(ASQ))
0  0034  0  CON(2)=C
0  0035  0  GO TO 51
0  0036  0  R=ALP3/(2.*DELTA)
0  0037  0  CON(1)=R
0  0038  0  EM=1./DELTA+2.
0  0039  0  CON(2)=EM
0  0040  0  S=DSQRT/(2.*DELTA)
0  0041  0  CON(3)=S
0  0042  0  P1=2.*EM-2.
0  0043  0  V=-2.*(1.+DELTA)*ALP3/(DELTA*
0  0043  1  DSQRT)
0  0044  0  CON(4)=V
0  0146  0  TEMP1=2.*EM-1.
0  0000  0  SUM=0.
0  0000  0  DO 502 I=1,16
0  0000  0  Y=(EZ(I)**R)*EXPEF(V*THETA(I))
0  0000  0  SUM=SUM+Y*CST(I)
0  0502  0  FRV=EXPEF(-1.5707963*V)*SUM*
0  0502  1  3.1415927
0  0046  0  C=TOTAL*S**TEMP1/(SIGMA*FRV)
0  0246  0  CON(5)=C
0  0047  0  GO TO 51
0  0048  0  ALPHA=R1-R2
0  0049  0  C=TOTAL*GAMAF(-EM2)/(SIGMA*
0  0049  1  GAMAF(EM1+1.)*GAMAF(-EM2-EM1-
0  0049  2  1.)*ALPHA**(EM1+EM2+1.))
0  0050  0  CON(5)=C
0  0051  0  CHISQ=0.
0  0000  0  NPCH=452
0  0000  0  IF(XCONF(1))452,52,52
0  0452  0  PUNCH1,NPCH,CON
0  0052  0  BELSQ=0.
0  0053  0  XZERO=X(1)-.5
0  0153  0  PROP=DMIN+DELD*.5
0  0054  0  I = 1
0  0055  0  XX(1)=XZERO
0  0056  0  XX(2)=X(I)
0  0057  0  XX(3)=XZERO+1.
0  0058  0  L = 1
0  0059  0  XBAR=XX(L)
0  0060  0  T=(XBAR-UP1)/SIGMA
0  0061  0  YNORM(L)=.3989421*EXPEF(-T*T/2
0  0061  1  .)*TOTAL/SIGMA
0  0062  0  GO TO(63,68,70,65),NTYPE
0  0063  0  YBAR=(C*(T-R1)**EM1)*((R2-T)**
0  0063  1  EM2)
0  0064  0  GO TO 91
0  0065  0  Z=(T-R2)
0  0066  0  YBAR=C*Z**EM2*(Z-ALPHA)**EM1
0  0067  0  GO TO 91
0  0068  0  YBAR=C*(A+T)**(ASQ-1.)*EXPEF
0  0068  1  (-A*T)
0  0069  0  GO TO 91
0  0070  0  TR=(T+R)
0  0071  0  ARG=TR/S
0  0072  0  XSQ=ARG*ARG
0  0000  0  COE=3.
0  0073  0  IF(XSQ-1.)83,175,76
0  0175  0  TANM1=.78539816
0  0275  0  GO TO 89
0  0076  0  TANM1=1.5707963-1./ARG
0  0077  0  Q = 1.
0  0078  0  DO81I=1,9
0  0079  0  TANM1=TANM1+Q/(ARG*XSQ*COE)
0  0080  0  XSQ=XSQ*ARG*ARG
0  0000  0  COE=COE+2.
0  0081  0  Q = -Q
0  0082  0  GO TO 89
0  0083  0  TANM1=ARG
0  0084  0  Q = -1.
0  0085  0  DO 88I=1,9
0  0086  0  TANM1=TANM1+Q*ARG*XSQ/COE
0  0087  0  XSQ=XSQ*ARG*ARG
0  0000  0  COE=COE+2.
0  0088  0  Q = -Q
0  0089  0  ARG1=V*(1.5707963-TANM1)
0  0090  0  YBAR=C*((TR*TR+S*S)**(-EM)*
0  0090  1  EXPEF(ARG1))
0  0091  0  Y(L)=YBAR
0  0000  0  NPCH=492
0  0000  0  IF(XCONF(1))492,92,92
0  0492  0  PUNCH1,NPCH,I,L,XX,YBAR,T,
0  0492  1  YNORM,XSQ,TANM1
0  0092  0  IF(L-3)93,95,95
0  0093  0  L=L+1
0  0094  0  GO TO 59
0  0095  0  YNAVG=(YNORM(1)+YNORM(3)+4.*
0  0095  1  YNORM(2))/6.
0  0096  0  YAVG=(Y(1)+Y(3)+4.*Y(2))/6.
0  0097  0  PUNCH1,I,X(I),PROP,G(I),Y(2),
0  0097  1  YAVG,YNORM(2)
0  0098  0  IF(I-NA)99,111,111
0  0099  0  BSQ=G(I)-YNAVG
0  0100  0  BSQ=BSQ*BSQ
0  0000  0  IF(YNAVG) 101,102,101
0  0101  0  BELSQ=BELSQ+BSQ/YNAVG
0  0102  0  CHI=G(I)-YAVG
0  0103  0  CHI=CHI*CHI
0  0000  0  IF(YAVG) 104,105,104
0  0104  0  CHISQ=CHISQ+CHI/YAVG
0  0105  0  XZERO=XZERO+1.
0  0106  0  Y(1)=Y(3)
0  0107  0  YNORM(1)=YNORM(3)
0  1107  0  PROP=PROP+DELD
0  0000  0  NPCH=4109
0  0000  0  IF(XCONF(1))4109,109,109
0  4109  0  PUNCH1,NPCH,I,L,CHI,CHISQ,BSQ,
0  4109  1  BELSQ
0  0109  0  I = I+1
0  0110  0  GO TO 55
0  0111  0  PUNCH1,ZERO
0  0112  0  PUNCH1,DMIN,DMAX,DELD,DAVG,
0  0112  1  SIGMA,ALP3,DELTA
0  0113  0  PUNCH1,ZERO
0  0114  0  PUNCH1,CON,CHISQ,BELSQ
0  0115  0  IF(J-NUMBR)116,121,121
0  0116  0  J = J + 1
0  0118  0  PAUSE
0  0119  0  GO TO 12
0  0121  0  PAUSE
0  0122  0  GO TO 1011
0  0123  0  END




GENERALIZED  FLOW  CHART  FOR  CURVE  FIT  TO 
PEARSON TYPE DISTRIBUTION

(Flow chart not reproduced.  Its steps, in order: read the table for
F(R,V); read NA, TOTAL, DMIN, DMAX, DELD, DAVG; read J, DD(I), X(I),
G(I); read ALP3, ALP3S, DELTA, SIGMA, UP1; read NTYPE; calculate the
necessary fixed parameters and the constant C for the selected NTYPE;
for each interval I calculate PROP, Y(2), YAVG, YNORM(2) and punch
I, X(I), PROP, G(I), Y(2), YAVG, YNORM(2); calculate CHI and BSQ for
each interval and sum to give CHISQ and BELSQ; punch DMIN, DMAX, DELD,
DAVG, SIGMA, ALP3, DELTA; punch CON, CHISQ, BELSQ.)


FIGURE  32 
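
The interval averaging and chi-square summation of the fitting program (statements 0095 through 0104) can be sketched as follows. This Python is an illustration only: the function names `simpson_average`, `normal_curve`, and `chi_square` are hypothetical, the intervals are assumed to have unit width, and the normal curve stands in for whichever Pearson type is being fitted. Unlike the listing, which appears to branch to the final punch before accumulating the last interval, the sketch sums over every interval.

```python
import math


def simpson_average(y_left, y_mid, y_right):
    """Average of a curve over one interval by Simpson's rule,
    as in YAVG = (Y(1) + Y(3) + 4*Y(2)) / 6."""
    return (y_left + 4.0 * y_mid + y_right) / 6.0


def normal_curve(total, mean, sigma):
    """Frequency on the normal curve, scaled to `total` observations,
    mirroring the YNORM statement of the listing."""
    def f(x):
        t = (x - mean) / sigma
        return 0.3989421 * math.exp(-t * t / 2.0) * total / sigma
    return f


def chi_square(observed, curve, midpoints):
    """Sum of (f - F)^2 / F, where F is the Simpson average of `curve`
    over the unit-width interval centered on each midpoint."""
    chisq = 0.0
    for g, x in zip(observed, midpoints):
        expected = simpson_average(curve(x - 0.5), curve(x), curve(x + 0.5))
        if expected != 0.0:          # the listing also skips a zero divisor
            chisq += (g - expected) ** 2 / expected
    return chisq


if __name__ == "__main__":
    counts = [1, 4, 6, 4, 1]
    mids = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(chi_square(counts, normal_curve(16.0, 0.0, 1.0), mids))
```

A smaller chi-square sum indicates closer agreement between the observed class frequencies and the fitted curve, which is how CHISQ and BELSQ are compared in the thesis.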


EXPLANATION OF FLOW CHART SYMBOLS

A Read or Punch instruction.
Data are read or punched in the
order given, left to right.


An arithmetic calculation.


A test of the difference A - B
for negativity, positivity, or
zero.  The program branches to
the next appropriate calculation
sequence depending upon the
numerical value of this difference.


A multi-way branch depending
upon the numerical value of
the enclosed index.


A  jump  is  made  in  the  flow 
chart  to  the  circle  with  the 
same  enclosed  letter. 
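
The two branching symbols described above correspond to the three-way IF and the computed GO TO of the listings, for example IF (DATA-TEST) 340,380,380 and GO TO (150,180),INDEX. A minimal Python sketch of their behavior follows; the function names `arithmetic_if` and `computed_goto` are hypothetical and serve only to illustrate the control flow.

```python
def arithmetic_if(a, b, if_negative, if_zero, if_positive):
    """Return the statement number chosen by the sign of A - B."""
    difference = a - b
    if difference < 0:
        return if_negative
    if difference == 0:
        return if_zero
    return if_positive


def computed_goto(labels, index):
    """Return the statement number selected by a 1-based index."""
    return labels[index - 1]


if __name__ == "__main__":
    # IF (DATA-TEST) 340,380,380 with DATA = 12.0 and TEST = 15.0
    print(arithmetic_if(12.0, 15.0, 340, 380, 380))
    # GO TO (150,180),INDEX with INDEX = 2
    print(computed_goto((150, 180), 2))
```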


FIGURE 33


APPENDIX C

(Map of numbered well locations.  Scale: 1" = 800'.)

FIGURE 34  MAP OF WELLS CORED - FIELD 1


(Map of numbered well locations.  Scale: 1" = 400'.)

FIGURE 35  MAP OF WELLS CORED - FIELD 2


(Map of numbered well locations.  Scale: 1" = 400'.)

FIGURE 36  MAP OF WELLS CORED - FIELD 3


